Can statistical token processing create the accountability needed for dialogue?
This explores whether next-token prediction — the smooth, probabilistic guts of an LLM — can on its own produce the accountability dialogue demands: committing to a position, tracking who believes what, and being answerable across turns.
The corpus's strongest signal is that it cannot, at least not without help. Two findings cut to the core of the problem. First, Shanahan's 20-questions regeneration test shows that an LLM never actually commits to a single character or claim: it holds a superposition of consistent possibilities and samples one at generation time, so regenerating a response to the same prompt yields a different but equally plausible answer (Do large language models actually commit to a single character?). Second, generation itself is described as a smooth probabilistic flow that continues toward the training distribution rather than weighing competing claims, so it produces fluent assertions that multiply without ever genuinely arguing against themselves (Does LLM generation explore competing claims while producing text?). Accountability means you can be pinned to a position and held to it; sampling without commitment is almost the opposite.
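The regeneration point can be made concrete with a toy sketch. This is not Shanahan's setup, only an illustration under stated assumptions: the object table, the `consistent` and `reveal` helpers, and the seeds are all hypothetical. It shows a "player" that never stores a secret object and instead samples one from whatever remains consistent with its earlier answers.

```python
import random

# Hypothetical toy world: each object has yes/no attributes.
OBJECTS = {
    "cat":     {"alive": True,  "flies": False},
    "sparrow": {"alive": True,  "flies": True},
    "horse":   {"alive": True,  "flies": False},
    "teapot":  {"alive": False, "flies": False},
    "kite":    {"alive": False, "flies": True},
}

def consistent(answers):
    """Objects compatible with every (question, answer) pair given so far."""
    return [name for name, attrs in OBJECTS.items()
            if all(attrs[q] == a for q, a in answers)]

def reveal(answers, rng):
    """No object was ever fixed; one is sampled only at reveal time."""
    return rng.choice(consistent(answers))

# The player has answered "yes" to "Is it alive?" and is asked to reveal.
history = [("alive", True)]
candidates = consistent(history)

# Regenerate the reveal with different seeds, as if re-running the prompt.
reveals = {reveal(history, random.Random(seed)) for seed in range(50)}

print(candidates)       # every object still consistent with the history
print(sorted(reveals))  # the distinct "secret objects" seen across regenerations
```

Every reveal is consistent with the dialogue so far, yet the revealed object changes across regenerations, which is the sense in which no single commitment was ever held.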
What's striking is that the corpus treats accountability not as something that emerges from more or better statistics, but as a scaffold you have to graft on from outside the prediction objective. The clearest statement comes from collaborative rational speech acts (CRSA), which bolts rate-distortion theory onto pragmatic reasoning to track both speakers' beliefs as they move from partial to shared understanding, and is explicitly framed as supplying "the information-theoretic framework that token-level LLM systems lack" (Can dialogue systems track both speakers' beliefs across turns?). In the same family, giving an agent an imaginary listener lets it check whether its own utterance would actually distinguish its persona from a distractor, suppressing generic or self-contradicting replies at inference time (Can imaginary listeners reduce dialogue agent contradictions?). These are accountability mechanisms, in the sense of answerability to a tracked other, and they are layered on top of the token machinery, not produced by it.
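A minimal sketch of the imaginary-listener idea, in the Rational Speech Acts style, may help. The fit scores, the two persona labels, and the `alpha` weight below are invented for illustration; a real system would presumably derive such scores from a language model rather than a hand-written table.

```python
# Hypothetical fit scores: how well each candidate reply matches each persona.
FIT = {
    "I like music.":                      {"self": 0.90, "distractor": 0.90},
    "I teach piano to kids.":             {"self": 0.80, "distractor": 0.10},
    "I have never played an instrument.": {"self": 0.05, "distractor": 0.60},
}

def listener(utterance):
    """Imaginary listener: posterior over personas given the utterance,
    assuming a uniform prior over the two personas."""
    scores = FIT[utterance]
    total = sum(scores.values())
    return {persona: s / total for persona, s in scores.items()}

def speaker_score(utterance, alpha=1.0):
    """Base fit to the agent's own persona, reweighted by how strongly the
    imaginary listener would identify that persona from the utterance."""
    return FIT[utterance]["self"] * listener(utterance)["self"] ** alpha

literal_choice = max(FIT, key=lambda u: FIT[u]["self"])
pragmatic_choice = max(FIT, key=speaker_score)

print(literal_choice)    # I like music.
print(pragmatic_choice)  # I teach piano to kids.
```

The generic reply wins on raw fit but tells the listener nothing about who is speaking, so the pragmatic score prefers the distinguishing reply, and the contradicting reply is pushed to the bottom in both rankings.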
The other route the corpus shows is changing what the statistics are rewarded for. Standard next-turn RLHF optimizes immediate helpfulness, which quietly trains models to be passive: to answer rather than ask, even when intent is unclear. Multi-turn-aware rewards that estimate long-term interaction value flip this into active intent discovery and clarifying questions (Why do language models respond passively instead of asking clarifying questions?). Persona drift gets attacked the same way: inverting the standard RL setup to train user simulators with prompt-to-line, line-to-line, and Q&A consistency as reward signals cuts persona drift by over 55% (Can training user simulators reduce persona drift in dialogue?). Older spoken-dialogue work made the underlying move decades ago. Because speech recognition error rates run 15-30% in noisy environments, a system cannot safely commit to one interpretation, so POMDP systems maintain a belief distribution over what the user meant rather than guessing (Why do dialogue systems need probabilistic reasoning?). Probability there is the route to accountability precisely because it tracks uncertainty honestly instead of papering over it.
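The belief-tracking step can be sketched as a plain Bayes update. This covers only the belief part of a POMDP (no state transitions, no learned policy), and the intent names and the 20% confusion rate are assumptions chosen to sit inside the 15-30% error range mentioned above.

```python
def update_belief(belief, observed, confusion):
    """Bayes rule: posterior(intent) is proportional to
    P(observed | intent) * prior(intent)."""
    unnormalized = {intent: confusion[intent][observed] * prior
                    for intent, prior in belief.items()}
    z = sum(unnormalized.values())
    return {intent: v / z for intent, v in unnormalized.items()}

# Hypothetical recognizer: hears the right intent 80% of the time.
CONFUSION = {
    "book_flight": {"book_flight": 0.8, "book_hotel": 0.2},
    "book_hotel":  {"book_flight": 0.2, "book_hotel": 0.8},
}

prior = {"book_flight": 0.5, "book_hotel": 0.5}

# The recognizer reports "book_flight" on two consecutive turns.
after_one = update_belief(prior, "book_flight", CONFUSION)
after_two = update_belief(after_one, "book_flight", CONFUSION)

print(round(after_one["book_flight"], 3))  # 0.8
print(round(after_two["book_flight"], 3))  # 0.941
```

One noisy observation leaves a one-in-five chance of the other intent, which the system keeps on the books; only repeated evidence moves the belief close enough to certainty to act on.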
The lateral surprise: the very smoothness that makes token prediction fluent is what makes it unaccountable, and almost every fix in this collection is a way of forcing the model to be answerable to something it would otherwise glide past, whether a tracked listener, a tracked belief state, a delayed reward, or a structured commitment. Conversation-analytic work on insert-expansions formalizes this as knowing when to stop generating and probe the user instead of silently chaining tools toward a wrong target (When should AI agents ask users instead of just searching?). So the honest answer to the question is: statistical token processing supplies fluency and a usable representation of uncertainty, but the accountability dialogue needs is an architectural addition (belief tracking, pragmatic self-models, and reward structures) without which the model will keep sampling plausible answers it was never actually committed to.
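One way to picture "knowing when to stop and probe" is an uncertainty gate in front of tool use. The sketch below is an assumption, not the method of the insert-expansion work: the entropy measure, the `max_bits` threshold, and the intent labels are all hypothetical choices made for illustration.

```python
import math

def entropy_bits(belief):
    """Shannon entropy of a belief distribution over intents, in bits."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def next_move(belief, max_bits=0.5):
    """Ask the user when intent is too uncertain; otherwise act on the
    most likely intent. The threshold is an arbitrary illustrative value."""
    if entropy_bits(belief) > max_bits:
        options = ", ".join(sorted(belief))
        return ("ask", f"Which did you mean: {options}?")
    return ("act", max(belief, key=belief.get))

uncertain = {"refund": 0.5, "exchange": 0.5}
confident = {"refund": 0.95, "exchange": 0.05}

print(next_move(uncertain))  # ('ask', 'Which did you mean: exchange, refund?')
print(next_move(confident))  # ('act', 'refund')
```

The gate makes the decision to proceed something the system can be held to: it either had enough evidence to act or it was obliged to ask.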
Sources (8 notes)
Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, proving no fixed commitment exists.
Token prediction trains models to continue toward the training distribution, not to explore logically related counterpositions. This smoothness in process produces smooth claims that multiply without generating new perspectives.
CRSA integrates rate-distortion theory with RSA to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.
Endowing dialogue agents with an imaginary listener via Rational Speech Acts reduces persona contradiction at inference time without NLI labels or extra training. The agent simulates whether utterances would distinguish its persona from a distractor, suppressing generic or contradictory responses.
CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.
By inverting standard RL setups to train user simulators for consistency using three complementary metrics (prompt-to-line, line-to-line, Q&A consistency) as reward signals, persona drift decreases by over 55%. This approach captures distinct failure types: local drift within turns, global drift across conversations, and factual contradictions.
Real-world speech recognition achieves 15-30 percent error rates in noisy environments, making deterministic flowchart dialogue systems unworkable. POMDP-based systems handle this by maintaining belief distributions over user intent rather than committing to single interpretations.
Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.