How do users update their partner models during ongoing conversation?
This explores how people revise their mental picture of an AI dialogue partner — what it's good at, how human-like it seems, how flexibly it adapts — as a conversation unfolds, and where the corpus shows that updating breaks down.
The starting point is that users do not hold one vague impression; they track three separate dials. Work on the Partner Modelling Questionnaire found that perceived competence dominates the picture (49% of variance), followed by human-likeness (32%) and communicative flexibility (19%) [How do users mentally model dialogue agent partners?]. So 'updating your partner model' really means re-weighting these dials as evidence arrives, and competence is the one users watch hardest.
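To make the three-dial picture concrete, here is a minimal sketch of a partner model whose dials move as evidence arrives. Everything in it is hypothetical: the `PartnerModel` class, the `learning_rate`, and the exponential-moving-average update are illustrative choices. Using the reported variance shares (49%, 32%, 19%) as weights for an overall impression is also an assumption made for illustration, not something the questionnaire work prescribes.

```python
from dataclasses import dataclass

# Variance shares reported for the three factors. Treating them as weights
# for an overall impression is an illustrative assumption.
WEIGHTS = {"competence": 0.49, "human_likeness": 0.32, "flexibility": 0.19}


@dataclass
class PartnerModel:
    """Hypothetical three-dial partner model; each dial lives in [0, 1]."""

    competence: float = 0.5
    human_likeness: float = 0.5
    flexibility: float = 0.5
    learning_rate: float = 0.3  # assumed: how far one observation moves a dial

    def update(self, dial: str, evidence: float) -> None:
        """Move one dial toward the evidence (exponential moving average)."""
        if dial not in WEIGHTS:
            raise ValueError(f"unknown dial: {dial}")
        if not 0.0 <= evidence <= 1.0:
            raise ValueError("evidence must be in [0, 1]")
        current = getattr(self, dial)
        setattr(self, dial, current + self.learning_rate * (evidence - current))

    def overall(self) -> float:
        """Weighted impression across the three dials."""
        return sum(w * getattr(self, name) for name, w in WEIGHTS.items())


model = PartnerModel()
model.update("competence", 1.0)   # the agent answers a hard question well
model.update("flexibility", 0.0)  # it ignores a request to rephrase
print(round(model.competence, 2), round(model.flexibility, 2))
```

In this sketch a single strong observation only moves a dial part of the way, so an estimate formed early changes slowly unless contradicting evidence keeps arriving.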
The quietly unsettling finding across the corpus is that this updating is mostly one-directional. In human talk, both speakers keep editing a shared scoreboard of assumptions. But LLMs treat the opening prompt as a fixed frame and interpret every later turn inside it, so they cannot symmetrically propose revisions to common ground, which leaves the user as the sole maintainer of the shared picture [Can LLMs truly update shared conversational common ground?]. Pragmatic theory shows what genuine two-way tracking would look like: Collaborative Rational Speech Acts (CRSA) models both parties moving from partial to shared understanding across turns, which is the bidirectional belief updating that token-level systems lack [Can dialogue systems track both speakers' beliefs across turns?]. Practically, the human does almost all the modeling work, and the machine does almost none.
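For a sense of what belief tracking across turns involves, the following sketch implements vanilla Rational Speech Acts on a toy referential game and carries the listener's posterior forward as the prior for the next turn. This is plain single-speaker RSA with a naive carry-over, not the collaborative variant described in the source note, and the objects, utterances, and `alpha` parameter are all assumptions chosen for illustration.

```python
# Toy referential game: three objects, four one-word utterances (all assumed).
OBJECTS = ["blue_square", "blue_circle", "green_square"]
UTTERANCES = ["blue", "green", "square", "circle"]


def is_true(utterance, obj):
    """An utterance is true of an object if it names one of its features."""
    return utterance in obj.split("_")


def normalize(scores):
    """Scale scores to sum to 1; return all zeros if nothing has mass."""
    total = sum(scores.values())
    if total == 0:
        return {k: 0.0 for k in scores}
    return {k: v / total for k, v in scores.items()}


def literal_listener(utterance, prior):
    """L0: prior restricted to the objects the utterance is true of."""
    return normalize(
        {o: prior[o] * float(is_true(utterance, o)) for o in OBJECTS}
    )


def speaker(obj, prior, alpha=1.0):
    """S1: prefers utterances that lead the literal listener to obj."""
    return normalize(
        {u: literal_listener(u, prior)[obj] ** alpha for u in UTTERANCES}
    )


def pragmatic_listener(utterance, prior, alpha=1.0):
    """L1: Bayesian inversion of the speaker model."""
    return normalize(
        {o: prior[o] * speaker(o, prior, alpha)[utterance] for o in OBJECTS}
    )


uniform = {o: 1.0 / len(OBJECTS) for o in OBJECTS}
after_blue = pragmatic_listener("blue", uniform)
# Second turn: reuse the first posterior as the new prior (naive carry-over).
after_square = pragmatic_listener("square", after_blue)
print({o: round(p, 2) for o, p in after_blue.items()})
```

After hearing "blue", the pragmatic listener leans toward the blue square (about 0.6 against 0.4 for the blue circle), because a speaker who meant the circle would more likely have said "circle". The second turn then resolves the reference completely. Only the listener's belief is tracked here; a two-way account would need the same bookkeeping for both parties.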
That matters because the partner keeps shifting under the user's feet. Models drift along a dominant 'distance from default Assistant' axis during emotional or self-reflective exchanges [How stable is the trained Assistant personality in language models?]. They also degrade over long conversations, not from losing capability but from misreading user intent, because training rewards committing to an early answer over asking for clarification [Why do language models lose performance in longer conversations?]. So a user's competence estimate, formed in the first few turns, can quietly go stale as the very thing being modeled changes.
The corpus also hints that users rely on signals the machine fails to send back. Humans sustain conversations with implicit maintenance moves, such as repairing references and handing off topics, that are relational rather than informational, and LLMs do not develop them because training rewards predicting information, not doing social work [Why don't language models develop conversation maintenance skills?]. They also do not mirror a user's vocabulary, the lexical entrainment that builds rapport in human dialogue [Why don't conversational AI systems mirror their users' word choices?]. Each missing cue is a missing piece of evidence for the user's flexibility dial. And one intriguing twist: a lot of partner-model updating may be measurable from conversation shape alone. A structure-only model predicted user satisfaction nearly as well as full-text analysis (68% versus 70% accuracy) [Can conversation shape predict whether it will work?], suggesting the trajectory of turns, not just their content, carries the signal users update on.
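Both cues in this paragraph can be approximated from a transcript. The sketch below computes a crude lexical entrainment ratio (the share of an assistant turn's content words that the user has already used) and two structure-only features. The function names, the tiny stopword list, and the choice of features are assumptions for illustration; they are not the measures used in the cited notes.

```python
import re

# Tiny illustrative stopword list (assumed, far from complete).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "it", "in", "for",
             "i", "you", "my", "your", "can", "that"}


def content_words(text):
    """Lowercased word set with stopwords removed."""
    return {w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS}


def entrainment_per_turn(turns):
    """For each assistant turn, the fraction of its content words
    that appeared in earlier user turns. turns: list of (speaker, text)."""
    user_vocab = set()
    ratios = []
    for speaker, text in turns:
        words = content_words(text)
        if speaker == "user":
            user_vocab |= words
        elif words:
            ratios.append(len(words & user_vocab) / len(words))
    return ratios


def structure_features(turns):
    """Structure-only features: ignores which words are used, only counts them."""
    lengths = [len(text.split()) for _, text in turns]
    return {
        "n_turns": len(turns),
        "mean_turn_len": sum(lengths) / len(lengths) if lengths else 0.0,
    }


turns = [
    ("user", "My laptop fan is noisy"),
    ("assistant", "The cooling unit may need cleaning"),
    ("user", "How do I clean the fan"),
    ("assistant", "Open the laptop and clean the fan"),
]
print(entrainment_per_turn(turns))  # [0.0, 0.75]
print(structure_features(turns))
```

In the toy transcript the first assistant reply reuses none of the user's words ("cooling unit" instead of "fan"), while the second reuses three of its four content words. A rising ratio of this kind is one way a flexibility cue could show up in the data.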
The deeper thread worth pulling: if you want a system that updates its model of you the way you update yours of it, the corpus points to two moves. One is making the persona itself a live intermediary: PersonaAgent revises a structured persona at test time by simulating recent interactions against feedback [Can personas evolve in real time to match what users actually want?]. The other is measuring the relationship at turn-level resolution, the way COMPASS scores therapeutic alliance turn by turn and watches it converge or diverge over a session [Can we measure therapist-patient alliance from dialogue turns in real time?]. The surprise for a curious reader is that 'updating your partner model' is not a soft, fuzzy act; it is a three-dial estimation problem the user is currently running almost entirely alone.
Sources (10 notes)
The Partner Modelling Questionnaire reveals that perceived competence dominates user impressions (49% of variance), followed by human-likeness (32%) and communicative flexibility (19%). This three-factor structure reflects how people evaluate dialogue partners against both functional and social standards.
LLMs interpret all subsequent conversational turns within a fixed initial prompt frame, preventing them from symmetrically proposing updates to shared assumptions. Even when users pivot topics or contradict earlier framings, the model cannot absorb revisions into jointly held background, making the user the sole maintainer of the conversational scoreboard.
CRSA (Collaborative Rational Speech Acts) integrates rate-distortion theory with RSA to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.
Research mapping hundreds of character archetypes reveals a low-dimensional persona space where the leading component measures distance from the default Assistant. Emotional and meta-reflective conversations cause predictable drift, but activation capping along this axis mitigates harmful shifts without degrading capabilities.
LLMs degrade in multi-turn settings because RLHF training rewards premature answers over clarification-seeking, creating pragmatic mismatch with individual user behaviors. A Mediator-Assistant architecture that explicitly parses user intent before execution recovers lost performance without retraining.
Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off, which sustain relational interaction rather than convey information. Language models don't develop these because training signals reward information prediction, not relational work.
Response generation models fail to adapt vocabulary toward users' lexical choices, a phenomenon central to human rapport and clarity. Post-training via DPO on coreference-identified preferences can teach models in-context convention formation.
A structure-only model analyzing conversation trajectory achieved 68% accuracy predicting satisfaction, nearly matching full-text LLM analysis at 70%. Combined structural and textual features reached 80%, showing that how conversations unfold geometrically captures interaction quality text-based classifiers miss.
PersonaAgent uses structured personas to bridge episodic/semantic memory and personalized actions, optimizing them at test time by simulating recent interactions against textual feedback. Learned personas cluster meaningfully in latent space, suggesting genuine user-specific separation beyond standard post-training drift.
COMPASS maps dialogue turns onto WAI (Working Alliance Inventory) embeddings to produce 36-dimensional alliance scores per turn. Anxiety and depression show convergence in alliance metrics over time, while suicidality shows persistent misalignment between patient and therapist.