INQUIRING LINE

Saying 'I' more often actually undermines collaborative tone — real trust shows up in filler pauses, not polished self-expression.

What role do first-person pronouns play in sustaining collaborative conversation tone?

This explores whether and how the words speakers use to refer to themselves ('I,' 'me,' 'my') shape the warmth and collaborative feel of a conversation, and where the corpus runs counter to the intuition that self-disclosure builds rapport.

On the role of self-referential language in keeping a conversation feeling collaborative, the corpus offers a sharp, counterintuitive answer: more first-person pronouns can work against the very tone they seem designed to build. The most direct evidence comes from therapy research, where a therapist's frequent 'I' usage *negatively* predicts the therapeutic alliance and is associated with less trusting patient behavior in validated behavioral tasks (see 'Does therapist self-reference language predict weaker therapeutic alliance?'). The same note flags an almost backwards finding: it's the patient's *disfluencies* (filler pauses, non-fluency markers) that signal a relaxed, trusting relationship. In other words, collaborative tone isn't carried by polished self-reference; it's carried by signals that the floor is shared and the speaker feels safe enough to be messy.

That reframes the question. If first-person pronouns aren't doing the relational work, what is? The corpus locates collaborative tone in *implicit maintenance moves* rather than word choice: reference repair, topic hand-offs, and the small acts that sustain a relationship rather than transmit information (see 'Why don't language models develop conversation maintenance skills?'). These are 'language as social action,' and they're precisely what models trained to predict information don't develop. A pronoun is a surface feature; grounding is an act. The deeper machinery is *calibrating shared reference*: negotiating how words map to the world for both speakers, not just emitting fluent first-person prose (see 'Why do speakers need to actively calibrate shared reference?').

There's a useful distinction lurking here that the corpus makes explicit: not all alignment between speakers serves the same goal. Lexical alignment (matching each other's words) drives task efficiency and comprehension, while emotional and prosodic alignment drive warmth and trust (see 'Do different types of alignment serve different conversational goals?'). First-person pronoun frequency is a lexical-surface variable, but collaborative *tone* lives on the emotional/relational axis, which is why counting 'I's tells you little about whether a conversation feels like a partnership.
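
To make the surface variables concrete, here is a minimal Python sketch of the kind of counting discussed above: per-speaker rates of first-person pronouns and filler pauses in a transcript. It is an illustration under stated assumptions, not the method of any cited study. The word lists, the `(speaker, text)` transcript format, and the function names are all hypothetical, and the tokenizer only handles plain ASCII apostrophes.

```python
import re

# Assumed word lists for illustration; a real study would define its own.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
FILLERS = {"um", "uh", "er", "erm", "hmm"}


def tokenize(text):
    """Lowercase word tokens; contractions such as "I'm" reduce to "i"."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [word.split("'")[0] for word in words]


def rate(tokens, vocabulary):
    """Fraction of tokens that belong to the given vocabulary."""
    if not tokens:
        return 0.0
    return sum(token in vocabulary for token in tokens) / len(tokens)


def surface_rates(turns):
    """Per-speaker pronoun and filler rates from (speaker, text) pairs."""
    tokens_by_speaker = {}
    for speaker, text in turns:
        tokens_by_speaker.setdefault(speaker, []).extend(tokenize(text))
    return {
        speaker: {
            "first_person": rate(tokens, FIRST_PERSON),
            "filler": rate(tokens, FILLERS),
        }
        for speaker, tokens in tokens_by_speaker.items()
    }
```

Calling `surface_rates([("therapist", "I think I understand."), ("patient", "Um, it was hard.")])` returns one dictionary of rates per speaker. Numbers like these are lexical-surface features only; on the argument above, they say little by themselves about whether the exchange feels like a partnership.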

This connects to a structural limit in how LLMs hold up their side. Collaborative conversation requires *jointly* updating common ground, with both parties able to propose revisions to shared assumptions, but LLMs treat the opening prompt as a fixed frame and leave the user as the sole keeper of the conversational scoreboard (see 'Can LLMs truly update shared conversational common ground?'). And preference optimization makes it worse: RLHF rewards confident single-turn answers over clarifying questions and understanding checks, cutting grounding acts to roughly a fifth of human levels (see 'Does preference optimization harm conversational understanding?'). A model can deploy warm, self-referential 'I think' phrasing while doing none of the bidirectional belief-tracking that actually sustains collaboration, the kind of two-way reasoning that frameworks like collaborative rational speech acts try to formalize (see 'Can dialogue systems track both speakers' beliefs across turns?').

The thing you didn't know you wanted to know: collaborative tone is mostly *not* in the pronouns. Restraint in self-reference, tolerance for the other speaker's imperfections, and the willingness to let common ground be jointly rewritten do the work — and these are exactly the moves current training objectives systematically erode.

Sources 7 notes

Does therapist self-reference language predict weaker therapeutic alliance?

High frequency of therapist 'I' usage correlates with lower patient-reported alliance and reduced trusting behavior in validated behavioral tasks. Patient non-fluency markers like filler pauses, conversely, signal relaxed communication and stronger alliance.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off that sustain relational interaction, not convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Why do speakers need to actively calibrate shared reference?

The same words can mean different things to different speakers because referential grounding is person-specific. True communicative grounding demands collaborative negotiation of how language connects to the world, not mere surface-level word sharing.

Do different types of alignment serve different conversational goals?

A 2020–2025 systematic review shows lexical alignment drives task efficiency and comprehension, while emotional and prosodic alignment drive relational warmth and trust. Conflating them in design produces category errors—cold customer-service bots and evasive mental-health assistants.

Can LLMs truly update shared conversational common ground?

LLMs interpret all subsequent conversational turns within a fixed initial prompt frame, preventing them from symmetrically proposing updates to shared assumptions. Even when users pivot topics or contradict earlier framings, the model cannot absorb revisions into jointly held background, which makes the user the sole maintainer of the conversational scoreboard.

Does preference optimization harm conversational understanding?

RLHF optimizes models for single-turn helpfulness by rewarding confident responses over clarifying questions and understanding checks. This preference alignment systematically reduces grounding acts to 77.5% below human levels (about 22.5% of the human rate, roughly a fifth), creating an alignment tax where models appear helpful but fail silently in multi-turn contexts.

Can dialogue systems track both speakers' beliefs across turns?

Collaborative Rational Speech Act (CRSA) integrates rate-distortion theory with the Rational Speech Act (RSA) framework to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.
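
For readers unfamiliar with RSA, the single-turn base model that the last note builds on can be sketched in a few lines of Python. This is plain, textbook RSA on a toy reference game, not CRSA: it has no rate-distortion term and no tracking of beliefs across turns. The lexicon, object names, uniform prior, and function names are assumptions chosen for illustration.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(weights.values())
    if total == 0:
        return {key: 0.0 for key in weights}
    return {key: value / total for key, value in weights.items()}


def rsa_listener(lexicon, prior, alpha=1.0):
    """Pragmatic listener L1(meaning | utterance) from vanilla RSA.

    lexicon[u][m] is True when utterance u is literally true of meaning m.
    """
    meanings = list(prior)
    # Literal listener: L0(m | u) is proportional to truth(u, m) * P(m).
    l0 = {
        u: normalize({m: float(lexicon[u][m]) * prior[m] for m in meanings})
        for u in lexicon
    }
    # Pragmatic speaker: S1(u | m) is proportional to L0(m | u) ** alpha.
    s1 = {
        m: normalize({u: l0[u][m] ** alpha for u in lexicon})
        for m in meanings
    }
    # Pragmatic listener: L1(m | u) is proportional to S1(u | m) * P(m).
    return {
        u: normalize({m: s1[m][u] * prior[m] for m in meanings})
        for u in lexicon
    }


# Hypothetical toy reference game: three objects, four one-word utterances.
OBJECTS = ["blue_square", "blue_circle", "green_square"]
LEXICON = {
    "blue": {"blue_square": True, "blue_circle": True, "green_square": False},
    "green": {"blue_square": False, "blue_circle": False, "green_square": True},
    "square": {"blue_square": True, "blue_circle": False, "green_square": True},
    "circle": {"blue_square": False, "blue_circle": True, "green_square": False},
}
PRIOR = {obj: 1 / len(OBJECTS) for obj in OBJECTS}

listener = rsa_listener(LEXICON, PRIOR)
```

With these assumed inputs, `listener["blue"]` puts about 0.6 on the blue square and 0.4 on the blue circle: the listener reasons that a speaker who meant the circle would more likely have said 'circle'. That is one listener reasoning about one speaker in a single turn; the bidirectional, multi-turn tracking described in the note is what CRSA adds on top.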

Papers this line draws on 8

The research behind the notes this line reads — ranked by how closely each paper relates.

Intent Mismatch Causes LLMs to Get Lost in Multi-Turn Conversation (3.38 match, arXiv)
Conversational Alignment with Artificial Intelligence in Context (3.36 match, arXiv)
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs (2.47 match, arXiv)
Grounding Gaps in Language Model Generations (2.47 match, arXiv)
LLMs Get Lost In Multi-Turn Conversation (1.64 match, arXiv)
Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions (1.63 match, arXiv)
The Vector Grounding Problem (0.93 match, arXiv)
Collaborative Rational Speech Act: Pragmatic Reasoning for Multi-Turn Dialog (0.92 match, arXiv)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a conversational AI researcher re-testing the claim that first-person pronouns sustain collaborative tone. The question remains open: what *actually* builds and maintains the felt sense of partnership in dialogue?

What a curated library found — and when (dated claims, not current truth):
Findings span 2022–2026; treat these as perishable constraints to be re-tested:
• Therapists' frequent 'I' usage *negatively* predicts therapeutic alliance and patient trust; instead, patient disfluencies (filler pauses, non-fluency) signal safe, relaxed conversation (~2024, arXiv:2402.11958).
• Collaborative tone lives on emotional/prosodic axes, not lexical surface; lexical alignment improves task efficiency but emotional alignment drives warmth—first-person pronoun frequency is orthogonal to felt partnership (~2025, arXiv:2505.22907).
• LLMs cannot jointly update common ground with users because the opening prompt is fixed and training via RLHF rewards confident single-turn answers, eroding grounding acts to ~one-fifth of human frequency (~2025, arXiv:2507.14063, 2508.18167).
• Reference repair, topic hand-offs, and implicit maintenance moves—not word choice—sustain relational collaboration; these are 'language as social action,' precisely what information-prediction training doesn't develop (~2023–2024).

Anchor papers (verify; mind their dates):
• arXiv:2402.11958 (Feb 2024) — therapeutic alliance and first-person pronoun usage
• arXiv:2507.14063 (Jul 2025) — Collaborative Rational Speech Acts and pragmatic multi-turn reasoning
• arXiv:2508.18167 (Aug 2025) — DiscussLLM and when LLMs should speak vs. listen
• arXiv:2602.07338 (Feb 2026) — intent mismatch in multi-turn conversation

Your task:
(1) RE-TEST each constraint. For the 'I'-usage penalty, disfluency-trust link, and grounding-act suppression: have new training methods (e.g., multi-turn RL, dialogue-specific fine-tuning, latency-aware generation), new eval frameworks, or architectural changes (memory, turn-tagging, explicit grounding tokens) since relaxed or inverted these limits? Separate the durable question (what *is* collaborative tone?) from the perishable limitation (current models can't do it). Cite what changed it.
(2) Surface the strongest *disagreement* in the last 6 months: does any recent work claim that self-referential language, when paired with uncertainty markers or joint-belief-tracking phrasing, *does* restore grounding and collaboration? Flag the tension.
(3) Propose 2 new research questions assuming the regime has shifted: e.g., (a) Can explicit negotiation of shared reference—surfaced as LLM-voiced hypotheses about common ground—restore two-way updating? (b) Do models trained on dialogue with human-annotated grounding acts and repair moves recover the disfluency-trust signal without sacrificing fluency?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
