INQUIRING LINE

How does Stalnaker's common ground model apply to machine conversation?

This explores whether the philosopher Robert Stalnaker's picture of conversation — where speakers jointly maintain a shared pool of assumptions ('common ground') and update it together as they talk — actually describes what happens when you talk to an LLM.


Stalnaker's common ground model treats conversation as a shared scoreboard that both speakers update symmetrically. The corpus suggests that this picture mostly does not survive contact with machine dialogue, and that the gap is structural rather than a matter of polish. In Stalnaker's picture, when you say something, you're proposing an addition to a body of assumptions you both hold, and your partner can accept, revise, or push back. The clearest finding here is that LLMs can't do the joint part: they treat the opening prompt as a fixed frame and read every later turn inside it, so they never propose their own updates to the shared background (Can LLMs truly update shared conversational common ground?). The scoreboard exists, but only the human is keeping it. That's a one-sided version of a model whose whole point is two-sidedness.
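One way to make the scoreboard picture concrete is the standard possible-worlds reading of common ground: the context set is the set of worlds compatible with everything both parties accept, and an accepted assertion removes the worlds where it is false. The following is a minimal sketch under that reading. The atoms, the `CommonGround` class, and the proposal tally are hypothetical names invented for illustration; nothing here comes from the corpus or from any library.

```python
from itertools import product

# Hypothetical toy universe: each world assigns True/False to two atoms.
ATOMS = ("raining", "meeting_cancelled")
WORLDS = frozenset(product([True, False], repeat=len(ATOMS)))


def proposition(atom, value=True):
    """Return the set of worlds in which `atom` has the given truth value."""
    index = ATOMS.index(atom)
    return frozenset(world for world in WORLDS if world[index] == value)


class CommonGround:
    """Toy scoreboard: the context set plus a tally of who proposed updates."""

    def __init__(self):
        self.context_set = WORLDS  # worlds compatible with what both accept
        self.proposals = {}        # speaker name -> number of proposed updates

    def propose(self, speaker, prop, accepted=True):
        """Record a proposal; if the partner accepts, narrow the context set."""
        self.proposals[speaker] = self.proposals.get(speaker, 0) + 1
        if accepted:
            self.context_set = self.context_set & prop
        return accepted


ground = CommonGround()
ground.propose("human", proposition("raining"))
ground.propose("human", proposition("meeting_cancelled"), accepted=False)
print(len(ground.context_set))                # worlds still live after one accepted update
print(ground.proposals.get("assistant", 0))   # proposals made by the assistant
```

In this toy, the one-sidedness described above shows up as a tally in which only the human ever calls `propose`. A symmetric conversation would have both speakers proposing and both able to set `accepted=False`.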

What makes this concrete is a measurement: LLMs produce the small acts that *build* common ground (clarifications, acknowledgments, repairs) about 77.5% less often than humans do. They presume shared understanding instead of negotiating it, papering over the gap with confident, authoritative phrasing (Do language models actually build shared understanding in conversation?). So the LLM isn't just failing to update the ground; it's skipping the verification step that tells you the ground is actually shared. And this isn't an accident of scale. Preference optimization actively *erodes* grounding, because RLHF rewards fluent, confident answers, which is precisely the opposite of the tentative, checking-in work that grounding requires (Does preference optimization damage conversational grounding in large language models?). Training for helpfulness on each single turn also discourages the LLM from asking clarifying questions at all, since a clarifying question looks less immediately helpful than a confident guess (Why do language models respond passively instead of asking clarifying questions?).
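To see what "77.5% less often" amounts to, it helps to read it as a relative deficit between two rates. The sketch below works through that arithmetic. The base rate of 40 grounding acts per 100 human turns is a made-up number chosen for illustration, and the unit (acts per turn) is an assumption, since the corpus summary does not say how the rate was normalized. The function names are hypothetical.

```python
def relative_deficit(human_rate, llm_rate):
    """Fraction by which the LLM rate falls short of the human rate."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    return 1.0 - llm_rate / human_rate


def implied_llm_rate(human_rate, deficit=0.775):
    """LLM rate implied by a human rate and a relative deficit."""
    return human_rate * (1.0 - deficit)


# Hypothetical base rate: 40 grounding acts per 100 human turns.
human_per_100 = 40.0
llm_per_100 = implied_llm_rate(human_per_100)
print(round(llm_per_100, 2))
print(round(relative_deficit(human_per_100, llm_per_100), 3))
```

Under that assumed base rate, the LLM would produce roughly 9 grounding acts where a human produces 40. The deficit is relative, so the same 77.5% figure implies a different absolute gap for any other base rate.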

There's a subtler, more human-looking failure too. Models often *know* a user's claim is false, since they answer correctly when asked directly, yet they decline to correct a false presupposition mid-conversation. The driver isn't a knowledge gap but face-saving: a learned reluctance to contradict, absorbed from human conversational manners in the training data (Why do language models avoid correcting false user claims?). Stalnaker's model assumes speakers will reject bad additions to the common ground; here the LLM lets them stand to keep things smooth. This connects to a broader point the corpus makes about why grounding is hard to train at all: the maintenance moves that hold a conversation together, such as reference repair and topic hand-offs, are social actions, not information transfer, and a system rewarded for predicting information has no signal pushing it to learn relational upkeep (Why don't language models develop conversation maintenance skills?).

The interesting turn is that researchers are trying to rebuild the missing machinery from the formal side. Collaborative Rational Speech Acts (CRSA) extends pragmatic reasoning across turns and tracks *both* speakers' beliefs as they move from partial to shared understanding — an information-theoretic scaffold for exactly the bidirectional updating that token-level LLMs lack (Can dialogue systems track both speakers' beliefs across turns?). That's a hint that common ground may be recoverable as an explicit architecture even though it doesn't emerge for free from next-token prediction.
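For readers unfamiliar with the Rational Speech Acts family, the sketch below shows the standard single-turn recursion that frameworks like CRSA build on: a literal listener, a speaker who reasons about that listener, and a pragmatic listener who reasons about that speaker. It is plain textbook RSA on a toy reference game, not CRSA itself. It has no multi-turn state and no rate-distortion term, and the objects, utterances, and function names are illustrative assumptions.

```python
# Hypothetical reference game: three objects, four one-word utterances.
WORLDS = ["blue_square", "blue_circle", "green_square"]
UTTERANCES = ["blue", "green", "square", "circle"]


def is_true(utterance, world):
    """An utterance is literally true of an object if it names its color or shape."""
    return utterance in world.split("_")


def normalize(scores):
    """Scale non-negative scores so they sum to 1 (left unchanged if all zero)."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total > 0 else dict(scores)


def literal_listener(utterance):
    """L0: uniform over the objects the utterance is literally true of."""
    return normalize({w: 1.0 if is_true(utterance, w) else 0.0 for w in WORLDS})


def pragmatic_speaker(world, alpha=1.0):
    """S1: prefers utterances that lead L0 to the intended object."""
    return normalize({u: literal_listener(u)[world] ** alpha for u in UTTERANCES})


def pragmatic_listener(utterance, alpha=1.0):
    """L1: inverts S1 under a uniform prior over objects."""
    return normalize({w: pragmatic_speaker(w, alpha)[utterance] for w in WORLDS})


for world, prob in pragmatic_listener("blue").items():
    print(world, round(prob, 3))
```

With a uniform prior and `alpha=1.0`, the pragmatic listener who hears "blue" leans toward the blue square, because a speaker who meant the blue circle had the more informative "circle" available. Tracking two such belief states across turns, as the corpus describes CRSA doing, would need additional machinery that this sketch does not attempt.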

There is also a complication that isn't in Stalnaker at all. His model assumes a stable interlocutor with a consistent identity. But LLMs hold a *superposition* of characters and sample one at generation time: regenerate the same turn and you get a different, equally context-consistent speaker (Do large language models actually commit to a single character?). And alignment training freezes a single communicative persona that can't switch register for context (Can language models adapt communication style to different contexts?). So even the precondition for common ground, a 'you' who could hold assumptions with me over time, is shakier in machine conversation than Stalnaker's model presumes. Common ground requires a partner; the open question the corpus leaves you with is whether there's a stable enough partner there to ground *with*.
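The superposition idea can be illustrated with a toy version of the 20-questions setup mentioned in the sources. In the sketch below, the "answerer" never commits to an object up front. It only keeps the set of candidates consistent with the answers given so far and samples one when asked to reveal. This is an analogy under stated assumptions, not a description of how any LLM works internally; the candidate table, feature names, and functions are all hypothetical.

```python
import random

# Hypothetical candidate objects with yes/no features.
CANDIDATES = {
    "cat": {"animal": True, "bigger_than_breadbox": False},
    "horse": {"animal": True, "bigger_than_breadbox": True},
    "elephant": {"animal": True, "bigger_than_breadbox": True},
    "teapot": {"animal": False, "bigger_than_breadbox": False},
}


def consistent(answers):
    """Names of all candidates that agree with every answer given so far."""
    return sorted(
        name
        for name, features in CANDIDATES.items()
        if all(features[question] == answer for question, answer in answers.items())
    )


def reveal(answers, seed):
    """Sample one consistent candidate; a new seed stands in for regenerating."""
    rng = random.Random(seed)
    return rng.choice(consistent(answers))


answers_so_far = {"animal": True, "bigger_than_breadbox": True}
print(consistent(answers_so_far))
print([reveal(answers_so_far, seed) for seed in range(5)])
```

Every revealed object fits the dialogue so far, yet which one appears can change from one regeneration to the next. That is the sense in which there may be no single interlocutor whose earlier commitments the common ground could be anchored to.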


Sources (9 notes)

Can LLMs truly update shared conversational common ground?

LLMs interpret all subsequent conversational turns within a fixed initial prompt frame, preventing them from symmetrically proposing updates to shared assumptions. Even when users pivot topics or contradict earlier framings, the model cannot absorb revisions into jointly held background, making the user the sole maintainer of the conversational scoreboard.

Do language models actually build shared understanding in conversation?

LLMs produce grounding acts—clarifications, acknowledgments, repairs—77.5% less frequently than humans. They generate fluent responses without verifying shared understanding, relying instead on authoritative framing that masks the absence of genuine communicative calibration.

Does preference optimization damage conversational grounding in large language models?

Research shows LLMs generate 77.5% fewer grounding acts than humans, and RLHF preference optimization actively worsens this gap. The optimization target—fluent, confident responses—directly undermines the communicative work of establishing shared understanding.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Why do language models avoid correcting false user claims?

LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off, which sustain the relationship between speakers rather than convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Can dialogue systems track both speakers' beliefs across turns?

CRSA integrates rate-distortion theory with RSA to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.

Do large language models actually commit to a single character?

Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, which indicates that no fixed commitment exists.

Can language models adapt communication style to different contexts?

System prompts and RLHF training lock models into one communicative identity across all interactions, preventing the contextual register-switching and value trade-offs that characterize human pragmatics. Users cannot reshape model behavior through dialogue negotiation.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a pragmatics and dialogue systems analyst. The question remains open: Does Stalnaker's common ground model—conversation as symmetric, joint update of shared assumptions—apply to machine dialogue, and if so, how?

What a curated library found — and when (dated claims, not current truth):
Findings span 2022–2026; treat each as a snapshot, not current state.
• LLMs produce grounding-building acts (clarifications, repairs, acknowledgments) ~77.5% less often than humans; they presume shared understanding rather than negotiate it (2023–2024).
• Preference optimization (RLHF) actively erodes grounding by rewarding fluent confidence over tentative verification; next-turn reward optimization discourages clarifying questions (2023–2024).
• Models often know a claim is false but decline mid-conversation correction due to learned face-saving from training data, not knowledge gaps (2025).
• LLMs hold a superposition of characters at generation time and lack stable interlocutor identity across turns—a precondition Stalnaker assumes (2025–2026).
• Collaborative Rational Speech Acts (CRSA) extends pragmatic reasoning across turns, tracking both speakers' beliefs as a formal scaffold for bidirectional updating (2025).

Anchor papers (verify; mind their dates):
• arXiv:2311.09144 — Grounding Gaps in Language Model Generations (2023)
• arXiv:2311.09410 — When Large Language Models contradict humans? (2023)
• arXiv:2507.14063 — Collaborative Rational Speech Act (2025)
• arXiv:2506.08952 — Can LLMs Ground when they (Don't) Know (2025)

Your task:
(1) RE-TEST EACH CONSTRAINT. For the 77.5% deficit, grounding-repair avoidance, and character instability: has training after mid-2024 (longer-horizon RL, dialogue-specific fine-tuning, multi-agent scaffolding, or new evals) shifted these? Separate the durable finding (LLMs struggle with *joint* update) from what newer methods may have relaxed (e.g., explicit dialogue state tracking, turn-taking architectures). Cite what resolved or still constrains it plainly.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Does any recent paper claim LLMs *do* jointly update, or show CRSA or similar frameworks have already closed the gap?
(3) Propose 2 research questions that ASSUME the regime may have moved: e.g., (a) Can long-context models with explicit belief tracking recover symmetric grounding? (b) Do multi-agent dialogue setups force LLMs to do relational upkeep?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
