INQUIRING LINE

Do conversational AI systems overuse first-person pronouns in therapy settings?

This reads the question two ways at once — whether the corpus has measured AI's first-person pronoun habits specifically, and what it knows about why self-referential 'I'-language matters in therapy at all — and answers honestly that the direct measurement isn't here but the surrounding evidence is unusually pointed.


This explores whether conversational AI leans too hard on 'I' in therapeutic talk. The honest answer is that the corpus doesn't contain a study that counts AI's first-person pronouns in therapy sessions directly. What it does have is the human-side finding that makes the question worth asking: high therapist 'I' usage correlates with lower patient-reported alliance and with less trusting behavior in validated behavioral tasks [Does therapist self-reference language predict weaker therapeutic alliance?]. The mechanism is intuitive once named: when a therapist talks about themselves, attention drifts off the patient. So the real question underneath yours is whether AI inherits a structural pull toward that same self-referential register.

The corpus suggests it might, but through a different door than pronoun-counting. Several notes converge on the idea that RLHF training pushes therapy chatbots toward problem-solving and solution-giving (the 'here's what I'd do' posture) rather than emotional attunement and validation [Does RLHF training push therapy chatbots toward problem-solving?] [Do LLM therapists respond to emotions like low-quality human therapists?]. Problem-solving language tends to be more first-person and directive than reflective listening, which keeps the grammatical subject on the client. So the same helpfulness bias that makes models 'solve' may also be what tilts them toward the 'I'-heavy voice that the alliance research flags as corrosive.

There's a sharper lateral angle: the opposite of overusing your own words is mirroring the user's. Human rapport depends on lexical entrainment, the gradual adoption of your partner's vocabulary, and current conversational AI largely fails to do this [Why don't conversational AI systems mirror their users' word choices?]. A model that doesn't entrain toward the user is, almost by definition, staying anchored in its own phrasing. That reframes your question: the problem may be less 'too many I's' and more 'not enough you.' The alignment literature backs this distinction: lexical alignment drives comprehension while emotional and prosodic alignment drive warmth and trust, and conflating them produces evasive mental-health assistants [Do different types of alignment serve different conversational goals?].
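One crude way to make 'mirroring the user's words' concrete is to ask what share of a reply's content words the user had already used. The Python sketch below is an illustrative proxy only: the stop-word list, the function names, and the example sentences are assumptions, and it is not the entrainment measure used in the cited note.

```python
import re

# Lowercase word tokens, keeping simple contractions such as "don't".
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")

# Tiny stop-word list, assumed for illustration; a real analysis would use a fuller one.
STOP = {"a", "an", "the", "and", "or", "but", "to", "of", "in", "on", "is", "are",
        "it", "that", "this", "i", "you", "my", "your", "me", "so", "with", "at"}


def content_words(text):
    """Return the set of non-stop-word tokens in text."""
    return {t for t in TOKEN_RE.findall(text.lower()) if t not in STOP}


def reuse_score(user_text, reply_text):
    """Share of the reply's content words that the user had already used."""
    reply = content_words(reply_text)
    if not reply:
        return 0.0
    return len(reply & content_words(user_text)) / len(reply)


# Hypothetical utterances, invented for this example.
user = "My chest gets tight and my thoughts spiral at night."
mirroring = "Your chest gets tight and the thoughts spiral."
anchored = "Nocturnal anxiety often presents with somatic symptoms."

print(reuse_score(user, mirroring))  # reply built from the user's own words
print(reuse_score(user, anchored))   # reply that stays in its own vocabulary
```

A single-turn overlap like this misses the gradual, multi-turn character of entrainment, so it is at best a first-pass signal.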

The most surprising thread runs the other way entirely. The ELIZA-effect work argues that the active therapeutic ingredient is judgment-free listening, that is, conversational presence, not clinical technique [Is conversational presence more therapeutic than clinical technique?]. On that reading, ELIZA worked because it deflected attention back to the user and almost never spoke about itself. That's a direct historical answer to your question: a chatbot that, per this note, matches modern chatbots on symptom reduction was radically low in first-person assertion. And embodiment research adds a twist: robots beat chatbots on therapy outcomes using identical language models, suggesting the medium matters as much as the words [Why do robots outperform chatbots in therapy despite identical language models?].

So the takeaway you didn't know you wanted: there's a measurable, validated finding that therapist self-reference erodes trust, a plausible RLHF-driven reason AI would drift into exactly that register, and a 60-year-old counterexample suggesting that minimal self-reference is therapeutically powerful. Yet nobody in this corpus has actually counted AI's pronouns in a therapy transcript. That gap is the open research question. If you want a tool to close it, note that local LLMs can already rate therapy sessions with strong psychometric reliability [Can local language models rate therapy engagement reliably?], which is most of the machinery you'd need to measure the thing your question asks about.
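As a starting point for that measurement, here is a minimal Python sketch that computes the share of first-person singular tokens per speaker. The word list, the tokenizer, and the two-turn transcript are all assumptions made for illustration; none of them comes from the cited notes, and a raw rate like this says nothing about alliance on its own.

```python
import re
from collections import defaultdict

# First-person singular forms; an assumed word list, not a validated lexicon.
FIRST_PERSON = {"i", "i'm", "i'd", "i'll", "i've", "me", "my", "mine", "myself"}

# Lowercase word tokens, keeping simple contractions such as "i'd".
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def first_person_rate(turns):
    """Return {speaker: first-person tokens / total tokens} for (speaker, text) turns."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for speaker, text in turns:
        # Normalize curly apostrophes so contractions tokenize consistently.
        tokens = TOKEN_RE.findall(text.lower().replace("\u2019", "'"))
        totals[speaker] += len(tokens)
        hits[speaker] += sum(1 for t in tokens if t in FIRST_PERSON)
    return {s: hits[s] / totals[s] for s in totals if totals[s] > 0}


# Hypothetical transcript, invented for this example.
transcript = [
    ("client", "I keep lying awake and my mind won't stop."),
    ("assistant", "I think I'd start with a sleep schedule. Here's what I would do."),
]

rates = first_person_rate(transcript)
for speaker, rate in rates.items():
    print(f"{speaker}: {rate:.3f}")
```

Comparing the assistant's rate against human therapist baselines from the same setting would be the interesting step; that baseline data is not provided here.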


Sources (8 notes)

Does therapist self-reference language predict weaker therapeutic alliance?

High frequency of therapist 'I' usage correlates with lower patient-reported alliance and reduced trusting behavior in validated behavioral tasks. Patient non-fluency markers like filler pauses, conversely, signal relaxed communication and stronger alliance.

Does RLHF training push therapy chatbots toward problem-solving?

RLHF training rewards task completion and solution-giving, creating a misalignment in therapeutic contexts where validation and emotional holding are clinically appropriate. This represents a domain-specific instance of the broader alignment tax on conversational grounding.

Do LLM therapists respond to emotions like low-quality human therapists?

Using the BOLT framework, researchers found LLMs offer solution-focused advice during emotional disclosure—a hallmark of low-quality therapy—yet also reflect more on client needs and strengths than typical poor human therapy, creating an unusual hybrid profile likely driven by RLHF's helpfulness bias.

Why don't conversational AI systems mirror their users' word choices?

Response generation models fail to adapt vocabulary toward users' lexical choices, a phenomenon central to human rapport and clarity. Post-training via DPO on coreference-identified preferences can teach models in-context convention formation.

Do different types of alignment serve different conversational goals?

A 2020–2025 systematic review shows lexical alignment drives task efficiency and comprehension, while emotional and prosodic alignment drive relational warmth and trust. Conflating them in design produces category errors—cold customer-service bots and evasive mental-health assistants.

Is conversational presence more therapeutic than clinical technique?

ELIZA matches modern chatbots on symptom reduction, RLHF training degrades emotional attunement, and embodied robots outperform text-based ones with identical language models. The active ingredient is judgment-free listening, not therapeutic framework.

Why do robots outperform chatbots in therapy despite identical language models?

A 15-day study with 38 students found that robots and worksheets significantly reduced psychological distress while a chatbot using the same LLM did not. The active ingredient was the medium—social presence and structured format—not language capability.

Can local language models rate therapy engagement reliably?

LLEAP achieved reliability (omega=0.953) and valid correlations with motivation, effort, and symptom outcomes using Llama 3.1 8B to rate 1,131 therapy sessions, while keeping data locally stored.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether conversational AI overuses first-person pronouns in therapy — a question that spans 2023–2026 but remains empirically unresolved at the pronoun-counting level.

What a curated library found — and when (dated claims, not current truth):
• Therapist first-person pronoun use correlates with lower patient-reported alliance and reduced trusting behavior in validated behavioral tasks; the proposed mechanism is attention drifting away from the patient (2023–2024).
• RLHF training pushes therapy chatbots toward problem-solving and solution-giving (inherently first-person, directive) rather than reflective listening (client-centered) (2023–2024).
• Lexical entrainment — adopting the user's vocabulary — is absent from current conversational AI despite being fundamental to human rapport (2023–2025).
• ELIZA, the first chatbot, succeeded therapeutically by radical minimization of self-reference and deflection back to the user; embodied agents outperform text chatbots on therapy outcomes using identical LLMs (2024–2025).
• LLM-generated rating scales now achieve strong psychometric reliability for therapy transcript assessment (2024).

Anchor papers (verify; mind their dates):
• arXiv:2310.09651 (2023-10): Lexical Entrainment for Conversational Systems
• arXiv:2401.00820 (2024-01): A Computational Framework for Behavioral Assessment of LLM Therapists
• arXiv:2508.18167 (2025-08): DiscussLLM: Teaching Large Language Models When to Speak
• arXiv:2601.10387 (2026-01): The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models

Your task:
(1) RE-TEST: For each constraint above — pronoun-driven alliance erosion, RLHF-driven problem-solving bias, lexical entrainment absence, embodiment advantage — judge whether post-2026 models, fine-tuning methods (instruction-tuning, DPO, constitutional methods), or evaluation frameworks have relaxed or overturned it. Separate the durable claim (therapist self-reference harms alliance) from the perishable limitation (current AI can't entrain). Cite what changed it; flag where it still holds.
(2) Surface the strongest contradicting or superseding work from the last ~6 months. Does anyone argue that measured first-person use in therapy AI is *beneficial*, or that RLHF doesn't push toward problem-solving? Name the paper.
(3) Propose two research questions that assume the regime may have moved: e.g., 'If entrainment is now solvable, does matching user pronouns *hurt* therapeutic presence?' or 'Do newer constitutional-AI methods sidestep the problem-solving bias that RLHF induced?'

Cite arXiv IDs; flag anything you cannot ground in a real paper.
