Can command generation replace intent classification in dialogue systems?
Explores whether generating pragmatic commands in a DSL could outperform traditional intent classification for task-oriented dialogue, particularly regarding training data needs and scalability.
The dominant industrial approach to task-oriented dialogue uses intent-based NLU: classify each user message into a predefined intent, extract slot values, and pass these to a dialogue manager. This paper introduces a fundamental architectural shift: replace intent classification with command generation in a domain-specific language (DSL).
The distinction is between semantics and pragmatics. "While NLU systems output intents and entities representing the semantics of a message, DU outputs a sequence of commands representing the pragmatics of how the user wants to progress the conversation." Intent classification asks "what does the user mean?" — command generation asks "what does the user want to happen next?"
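The contrast can be made concrete with two output types. The following is a minimal sketch, not the paper's actual schema: the class names, the example message, and every command name other than Clarify (`SetSlot`, `StartFlow`) are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NLUResult:
    """Semantics of a single message: one intent plus extracted entities."""
    intent: str
    entities: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Command:
    """One pragmatic step: how the conversation should progress next."""
    name: str
    args: tuple = ()


# Hypothetical user message, mid-way through a money transfer:
#   "Actually, send it to John instead. And what's my balance?"

# Intent-based NLU must squeeze the whole message into one label.
nlu_output = NLUResult(intent="transfer_money", entities={"recipient": "John"})

# Dialogue understanding emits a sequence of commands instead.
du_output = [
    Command("SetSlot", ("recipient", "John")),   # corrects an earlier value
    Command("StartFlow", ("check_balance",)),    # digresses into another flow
]
```

The single intent label has nowhere to record that the message both corrects a slot and opens a second task; the command list expresses both.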
Key advantages over intent-based approaches:
Context-dependent by design. NLU interprets one message in isolation. Dialogue Understanding considers the full running transcript plus the assistant's business logic. Flow definitions and conversation state provide additional context for understanding.
No annotated training data required. Developers specify only flow definitions (business logic as code). The LLM's in-context learning handles language understanding without annotated datasets, eliminating the expensive data collection that intent-based systems require.
Scales without degradation. Intent taxonomies become unmanageable at hundreds of intents: they are "difficult to remember and reason about," error-prone to modify, and context-insensitive. Command generation scales naturally because a new flow adds new possible commands without requiring the existing ones to be redefined or relabeled.
Handles repair natively. Corrections, digressions, interruptions, and cancellations are handled through conversation repair patterns. Developers specify only the "happy path" — repair is built into the architecture, not bolted on.
Coreference resolution is implicit. By including the full conversation transcript in the LLM prompt, commands are generated with arguments already fully resolved. No separate coreference module needed.
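The advantages listed above can be sketched as a prompt builder plus a command parser. This is a hedged illustration under stated assumptions: the flow names, the prompt wording, the textual command syntax, and the command names other than Clarify are all hypothetical, and the LLM call is replaced by a canned string.

```python
import re

# Hypothetical flow definitions: the only thing the developer writes.
FLOWS = {
    "transfer_money": {"description": "Send money to a contact",
                       "slots": ["recipient", "amount"]},
    "check_balance": {"description": "Show the account balance", "slots": []},
}

ALLOWED = {"StartFlow", "SetSlot", "CancelFlow", "Clarify"}  # assumed DSL
COMMAND_RE = re.compile(r"^(\w+)\((.*)\)$")


def build_prompt(flows, transcript, active_flow, slots):
    """Assemble the in-context prompt from business logic and state."""
    flow_lines = [
        f"- {name}: {spec['description']} "
        f"(slots: {', '.join(spec['slots']) or 'none'})"
        for name, spec in flows.items()
    ]
    slot_lines = [f"- {key} = {value}" for key, value in slots.items()]
    return "\n".join([
        "Available flows:", *flow_lines,
        f"Active flow: {active_flow}",
        "Filled slots:", *(slot_lines or ["- none"]),
        "Conversation so far:", *transcript,
        "Reply with one command per line.",
    ])


def parse_commands(llm_output, flows):
    """Keep only well-formed commands that refer to known flows."""
    commands = []
    for line in llm_output.splitlines():
        match = COMMAND_RE.match(line.strip())
        if not match or match.group(1) not in ALLOWED:
            continue
        name = match.group(1)
        args = tuple(a.strip() for a in match.group(2).split(",") if a.strip())
        if name in {"StartFlow", "Clarify"} and any(a not in flows for a in args):
            continue  # the model named a flow that does not exist
        commands.append((name, args))
    return commands


transcript = [
    "USER: I want to send money to Jon",
    "AI: How much would you like to send?",
    "USER: Sorry, I meant John. And what's my balance?",
]
prompt = build_prompt(FLOWS, transcript, "transfer_money", {"recipient": "Jon"})

# Canned stand-in for the model's reply (no real LLM is called here).
llm_output = "SetSlot(recipient, John)\nStartFlow(check_balance)\nStartFlow(delete_account)"
commands = parse_commands(llm_output, FLOWS)

# Adding a flow only extends the prompt; nothing existing is relabeled.
FLOWS["block_card"] = {"description": "Block a lost or stolen card", "slots": ["card"]}
extended_prompt = build_prompt(FLOWS, transcript, "transfer_money", {"recipient": "Jon"})
```

In this sketch the correction ("I meant John") and the digression (the balance question) both arrive as ordinary commands, and the resolved value `John` comes straight from the transcript in the prompt rather than from a separate coreference step. The parser drops the unknown `delete_account` flow, which is one simple way to keep execution inside the defined business logic.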
The limitation of intent classification is precisely that it treats understanding as classification: "messages are 'understood' by assigning them to a predefined intent." But user utterances often don't correspond to specific tasks — "I lost my wallet" could map to replace card, block card, or freeze card. Command generation can express this ambiguity through a Clarify command, while intent classification forces a premature decision.
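A small sketch of the "I lost my wallet" case follows. The classifier scores are made-up numbers for illustration, and the flow names, phrasing table, and tuple representation of the Clarify command are assumptions rather than the paper's format.

```python
# Hypothetical flows and how each would be phrased back to the user.
FLOW_PHRASES = {
    "replace_card": "replace your card",
    "block_card": "block your card",
    "freeze_card": "freeze your card",
}

# Made-up classifier scores for "I lost my wallet": nearly tied.
intent_scores = {"replace_card": 0.36, "block_card": 0.34, "freeze_card": 0.30}

# Intent classification has to commit to the top label anyway.
forced_intent = max(intent_scores, key=intent_scores.get)


def clarification_question(command, phrases):
    """Turn a Clarify command into a question listing its options."""
    name, options = command
    if name != "Clarify" or len(options) < 2:
        raise ValueError("expected a Clarify command with at least two options")
    parts = [phrases[option] for option in options]
    return "Would you like to " + ", ".join(parts[:-1]) + " or " + parts[-1] + "?"


# Command generation can keep the ambiguity explicit instead.
clarify = ("Clarify", ("replace_card", "block_card", "freeze_card"))
question = clarification_question(clarify, FLOW_PHRASES)
```

Here the classifier returns `replace_card` on a margin of 0.02, while the Clarify command carries all three candidates forward to the user.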
As discussed in "When should AI agents ask users instead of just searching?", the Clarify command in this architecture is the engineering implementation of the insert-expansion from conversation analysis (CA): the system recognizes ambiguity and initiates a sub-sequence to resolve it before proceeding. As discussed in "Why can't conversational AI agents take the initiative?", this architecture also gives the agent a structured mechanism for initiative-taking within the bounds of defined business logic.
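One way to picture the insert-expansion is as a frame pushed onto a stack and popped once the ambiguity is resolved. This is only a sketch: the stack model, the frame kinds, and the `handle` function are assumptions made for illustration, not a description of the architecture's actual internals.

```python
class DialogueStack:
    """Stack of open sequences; the top frame is what is being worked on."""

    def __init__(self):
        self._frames = []

    def push(self, kind, payload):
        self._frames.append((kind, payload))

    def pop(self):
        return self._frames.pop()

    @property
    def active(self):
        return self._frames[-1] if self._frames else None

    def __len__(self):
        return len(self._frames)


def handle(stack, command):
    """Apply one command to the stack (hypothetical command names)."""
    name, args = command
    if name == "Clarify":
        # Open a sub-sequence on top of the pending request.
        stack.push("clarify", args)
    elif name == "StartFlow":
        # The answer closes the sub-sequence and the request it served.
        while stack.active and stack.active[0] in {"clarify", "request"}:
            stack.pop()
        stack.push("flow", args[0])


stack = DialogueStack()
stack.push("request", "I lost my wallet")
handle(stack, ("Clarify", ("replace_card", "block_card", "freeze_card")))
during_clarification = stack.active[0]   # the insert-expansion is on top

handle(stack, ("StartFlow", ("block_card",)))  # user answered "block it"
after_clarification = stack.active
```

The original request stays on the stack while the clarification runs, which mirrors the CA description of resolving the inserted sequence before proceeding with the base one.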
Inquiring lines that use this note as a source (20)
This note is a source for these synthesized inquiries.
- Can better AI interfaces eliminate the attention cost of prompt composition and evaluation?
- Why does dialogue-shaped text fail to produce dialogue-like operations in practice?
- Can systems guide users adaptively without imposing predetermined dialogue structures?
- Can real-time detection identify when users have incomplete or underdeveloped intent?
- What makes intent taxonomies unmanageable at hundreds of intents?
- How do probabilistic dialogue systems handle ASR errors differently?
- Can generative interfaces help users articulate what they actually want?
- What types of tasks benefit most from dynamically generated interfaces?
- How does API-first interaction compare to generative interface approaches?
- Can prompt engineering overcome the gulf between user intent and AI interpretation?
- How should task-oriented and socially-oriented dialogue acts receive different training signals?
- Why do traditional interfaces bypass the intention formation problem that language models expose?
- Can topic planning and response generation reduce dialogue turns?
- What data would be needed to train proactive conversational systems?
- How should headers index procedural intent differently from keyword chunking?
- How do discourse relation types improve dialogue beyond sentence-level semantic matching?
- Can offline RL and pragmatic inference together improve dialogue agent reliability?
- Why are task-oriented dialogue datasets systematically underrepresenting human proactive behavior?
- Can a separate mediator layer improve intent understanding before task execution?
- What makes protocols better than free-form prompting for tool coordination?
Related concepts in this collection (4)
- When should AI agents ask users instead of just searching?
  Explores whether tool-enabled LLMs should probe users for clarification when uncertain, rather than silently chaining tool calls that drift from intent. Examines conversation analysis patterns as a formal alternative.
  Clarify command as engineering implementation of insert-expansions
- Why can't conversational AI agents take the initiative?
  Explores whether current LLMs lack the structural ability to lead conversations, set goals, or anticipate user needs, and what architectural changes might enable proactive dialogue.
  command generation gives agents structured initiative within business logic
- Can dialogue planning balance fast responses with strategic depth?
  Can a system use quick instinctive responses for familiar conversation contexts while activating deeper planning only when uncertainty demands it? This explores whether adaptive computation improves dialogue goal-reaching.
  command generation is the System 1 fast path; complex planning activates when commands don't resolve
- Why do protocol-based tool integrations fail in production workflows?
  Explores whether standardized tool protocols like MCP introduce non-determinism that undermines agent reliability, and what causes ambiguous tool selection in production systems.
  command generation + deterministic business logic execution avoids the MCP non-determinism problem
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Intent Mismatch Causes LLMs to Get Lost in Multi-Turn Conversation
- DiaSynth: Synthetic Dialogue Generation Framework for Low Resource Dialogue Applications
- Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans
- Are LLMs All You Need for Task-Oriented Dialogue?
- Task-Oriented Dialogue as Dataflow Synthesis
- CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants
- CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
- Dynamic Task-Oriented Dialogue: A Comparative Study of Llama-2 and Bert in Slot Value Generation
Original note title
dialogue understanding reframed as command generation replaces intent classification — outputting pragmatics instead of semantics eliminates training data requirements