How do dialogue dimensions predict explanation success across different exchanges?
This explores the finding that explanations don't succeed because of what's said but because of measurable dimensions of how the conversation moves — and how the corpus generalizes that pattern from explanation to dialogue success broadly.
Whether an explanation lands depends less on its content than on a few measurable dimensions of the exchange itself, and the corpus has a lot to say about that. The anchor result comes from an analysis of 399 everyday explanations, which found that three interacting dimensions (topic relation, dialogue act, and explanation move) jointly predict whether understanding actually happens (What makes explanations work in real conversation?). The key word is *jointly*: explanations are co-constructed through back-and-forth, not delivered. This directly challenges how today's LLMs generate explanations as polished monologues.
The same insight shows up under different names elsewhere. One line of work reframes explainable AI entirely as a communication problem rather than a transparency problem: explanation quality lives not in the explanation but in a triad of who presents it, how it's framed, and what role the recipient plays (What if XAI is fundamentally a communication problem?). That's the same move: success is a property of the exchange, not the artifact. And it generalizes beyond explanation: one striking result found that structural features of a conversation alone predicted dialogue satisfaction at 68% accuracy, nearly matching a 70% content-based baseline, with a hybrid of the two reaching 80% (Can conversation structure predict dialogue success better than content?). How you talk rivals what you say.
If dimensions matter, the next question is which ones, and the corpus warns they aren't interchangeable. A systematic review found that lexical alignment drives task efficiency and comprehension, while emotional and prosodic alignment drive warmth and trust; conflating them produces category errors like cold service bots and evasive mental-health assistants (Do different types of alignment serve different conversational goals?). Another framework treats dialogue as a living system, tracking linguistic complexity, emotional trajectory, topic coherence, and relevance as simultaneous temporal streams that statistical snapshots miss (Can tracking dialogue dimensions simultaneously reveal hidden conversation patterns?). The dimensions that predict explanation success are one instance of a broader truth: conversations have measurable architecture.
Here's the part you might not expect: the very training that makes LLMs feel helpful actively erodes the dialogue acts that make explanation work. RLHF optimizes for confident single-turn answers and suppresses the grounding acts that co-construction depends on, such as clarifying questions and understanding checks, leaving them 77.5% below human levels (Does preference optimization harm conversational understanding?). Next-turn reward optimization specifically trains models to respond passively instead of probing for intent (Why do language models respond passively instead of asking clarifying questions?). So the failure isn't that models can't explain; it's that alignment removed the conversational moves that the explanation-quality research says are load-bearing.
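To make the "77.5% below human levels" figure concrete, here is a minimal sketch of the arithmetic behind a relative shortfall. The function name `relative_shortfall` and both rates are hypothetical; the rates are chosen only so the calculation lands on the reported percentage, and they are not taken from the underlying study.

```python
def relative_shortfall(model_rate: float, human_rate: float) -> float:
    """Return the fraction by which model_rate falls below human_rate."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    return 1.0 - model_rate / human_rate


# Hypothetical rates: share of turns that contain a grounding act
# (a clarifying question or an understanding check).
human_rate = 0.40  # illustrative only, not from the study
model_rate = 0.09  # illustrative only, not from the study

print(f"Grounding acts fall {relative_shortfall(model_rate, human_rate):.1%} below the human rate")
```

Read this way, a model that is 77.5% below human levels produces fewer than one grounding act for every four a human would.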
If you want to go deeper on what would fix this, two threads point forward. Collaborative rational speech acts offer an information-theoretic way to track both speakers' beliefs as understanding moves from partial to shared across turns (Can dialogue systems track both speakers' beliefs across turns?). And structuring a model's own reasoning as internal dialogue rather than monologue improves diversity and coherence (Can dialogue format help models reason more diversely?), which suggests the dialogue-as-dimensions lens helps not just how machines explain to us, but how they think to themselves.
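For readers unfamiliar with rational speech acts, the sketch below shows the plain, single-turn RSA recursion on a toy reference game. It is not the collaborative variant: the rate-distortion component and multi-turn belief tracking described in the notes are not reproduced here. The objects, utterances, uniform prior, and rationality parameter `alpha` are all assumptions made for illustration.

```python
from typing import Dict

# Toy reference game (hypothetical): three objects, four one-word utterances.
OBJECTS = ["blue_square", "blue_circle", "green_square"]
UTTERANCES = ["blue", "green", "square", "circle"]


def is_true(utterance: str, obj: str) -> bool:
    """An utterance is literally true of an object if it names one of its features."""
    return utterance in obj.split("_")


def normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative weights so they sum to 1 (all zeros stay zero)."""
    total = sum(weights.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in weights.items()}


def literal_listener(utterance: str) -> Dict[str, float]:
    """L0: uniform belief over the objects the utterance is literally true of."""
    return normalize({o: 1.0 if is_true(utterance, o) else 0.0 for o in OBJECTS})


def pragmatic_speaker(obj: str, alpha: float = 1.0) -> Dict[str, float]:
    """S1: prefers utterances that lead the literal listener to the intended object."""
    return normalize({u: literal_listener(u)[obj] ** alpha for u in UTTERANCES})


def pragmatic_listener(utterance: str, alpha: float = 1.0) -> Dict[str, float]:
    """L1: inverts the speaker model under a uniform prior over objects."""
    return normalize({o: pragmatic_speaker(o, alpha)[utterance] for o in OBJECTS})


if __name__ == "__main__":
    for obj, prob in pragmatic_listener("blue").items():
        print(f"{obj}: {prob:.2f}")
```

In this toy game the literal listener splits "blue" evenly between the two blue objects, while the pragmatic listener leans toward the blue square, reasoning that a speaker who meant the blue circle would more likely have said "circle". Tracking how such beliefs shift for both parties over several turns is the part the collaborative extension is described as adding.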
Sources (9 notes)
Analysis of 399 daily-life explanations shows that topic relation, dialogue act, and explanation move jointly predict understanding success. Explanations are co-constructed through interaction patterns, not monological delivery—challenging how LLMs currently generate explanations.
Explanation quality is not intrinsic to the explanation itself but depends on the rhetorical situation: who presents it, how it is framed, and what role the recipient plays. Evaluations that ignore this triad measure only a narrow slice of real-world effectiveness.
TRACE achieved 68% accuracy predicting dialogue success from structural features alone, nearly matching a 70% content-based baseline. A hybrid combining both reached 80%, suggesting how agents communicate rivals what they say.
A 2020–2025 systematic review shows lexical alignment drives task efficiency and comprehension, while emotional and prosodic alignment drive relational warmth and trust. Conflating them in design produces category errors—cold customer-service bots and evasive mental-health assistants.
Conversational DNA encodes four simultaneous dimensions—linguistic complexity, emotional trajectories, topic coherence, and conversational relevance—as temporal streams. The reverse Turing test finding showed expert assessments of AI diverged sharply, suggesting conversational structure shapes interpretation as much as content.
RLHF optimizes models for single-turn helpfulness by rewarding confident responses over clarifying questions and understanding checks. This preference alignment systematically leaves grounding acts 77.5% below human levels, creating an alignment tax where models appear helpful but fail silently in multi-turn contexts.
CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.
CRSA integrates rate-distortion theory with RSA to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.
DialogueReason, which structures a single model's internal reasoning as dialogue between distinct agents in separate scenes, overcomes monologue reasoning's fixed-strategy and fragmented-attention weaknesses, especially on tasks requiring multiple problem-solving approaches.