INQUIRING LINE

How does intrinsic motivation drive conversational agents beyond passive responsiveness?

This explores what it takes to make a chatbot do more than wait for prompts — whether giving an agent its own internal drive (a model of 'I have something worth saying') can break it out of the passive, reactive mode that current training bakes in.


For a conversational agent to act on its own initiative rather than just answer when spoken to, it has to overcome a default that is built in on purpose; that is how the corpus frames the problem. The starting point is that today's chatbots are *structurally* passive: they can't open topics, plan ahead, or steer a conversation because their training rewards responding to queries, not generating dialogue from goals of their own (see "Why can't conversational AI agents take the initiative?"). So 'intrinsic motivation' isn't a garnish here. It's the missing ingredient that would let an agent decide *whether* and *when* it has a reason to speak.

The clearest concrete attempt is the Inner Thoughts framework, a five-stage pipeline that runs a stream of covert 'thoughts' alongside the live conversation and uses 10 motivation heuristics to judge when the agent actually has something worth contributing, drawing on think-aloud studies in cognitive psychology. Participants preferred it 82% of the time over the usual approach of just predicting who should talk next (see "Can AI agents learn when they have something worth saying?"). The interesting part is *why* this works: instead of asking 'should I reply now?', it asks 'do I have a reason to?', which is exactly the goal-awareness the passive models lack.
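
The 'do I have a reason to?' check can be sketched as a small gate over candidate thoughts. This is a minimal illustration, not the Inner Thoughts implementation: the three heuristic names, the plain average, and the 0.6 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thought:
    text: str
    # Hypothetical heuristic scores in [0, 1]. The names are illustrative
    # and are not the framework's actual motivation heuristics.
    relevance: float
    novelty: float
    urgency: float


def motivation(thought: Thought) -> float:
    """Average the heuristic scores into a single motivation value."""
    return (thought.relevance + thought.novelty + thought.urgency) / 3


def choose_contribution(thoughts: List[Thought], threshold: float = 0.6) -> Optional[Thought]:
    """Return the most motivated thought, or None if the agent should stay silent."""
    if not thoughts:
        return None
    best = max(thoughts, key=motivation)
    return best if motivation(best) >= threshold else None


thoughts = [
    Thought("Restate what the user just said", 0.9, 0.1, 0.2),
    Thought("Flag that the deadline conflicts with the earlier plan", 0.9, 0.8, 0.7),
]
picked = choose_contribution(thoughts)
print(picked.text if picked else "(stay silent)")
```

The design point is that silence is a valid output: when no thought clears the threshold, the gate returns nothing instead of forcing a reply.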

The deeper diagnosis is that passivity is trained in, not accidental. Standard RLHF optimizes for immediate, single-turn helpfulness, which actively *discourages* asking clarifying questions or volunteering multi-turn insight (see "Why do language models respond passively instead of asking clarifying questions?"). The same pressure shows up as an 'alignment tax': preference optimization rewards confident answers over understanding-checks, leaving models performing 77.5% fewer grounding acts than humans do (see "Does preference optimization harm conversational understanding?"). And the relational glue of conversation, such as repairing references and handing off topics, never develops because training rewards predicting information, not doing social work (see "Why don't language models develop conversation maintenance skills?"). So driving an agent 'beyond passive responsiveness' means changing the *reward*, not just the prompt. RLVER does this for empathy by using a simulated user's emotion trajectory as the training signal (see "Can emotion rewards make language models genuinely empathic?"), and multi-turn-aware rewards do it for collaboration by valuing long-term interaction over the next reply (see "Why do language models respond passively instead of asking clarifying questions?").
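
The difference between scoring only the next reply and scoring the whole interaction can be sketched with a toy comparison. This is an illustration under stated assumptions, not CollabLLM's estimator: the discounted sum, the 0.9 discount, and the per-turn scores are all hypothetical.

```python
from typing import List


def single_turn_reward(turn_scores: List[float]) -> float:
    """Score only the immediate reply (the first turn)."""
    return turn_scores[0]


def multi_turn_reward(turn_scores: List[float], discount: float = 0.9) -> float:
    """Discounted sum over the reply and the simulated turns that follow it."""
    return sum(score * discount ** t for t, score in enumerate(turn_scores))


# Hypothetical per-turn helpfulness scores from a simulated continuation.
confident_guess = [0.8, 0.2, 0.2]      # answers at once, then the user must correct it
clarifying_question = [0.3, 0.9, 0.9]  # asks first, then addresses the real need

print(single_turn_reward(confident_guess) > single_turn_reward(clarifying_question))
print(multi_turn_reward(clarifying_question) > multi_turn_reward(confident_guess))
```

With these made-up scores, the single-turn view favors the confident guess while the multi-turn view favors the clarifying question, which is the reversal the paragraph describes.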

What you might not expect is how *much* this changes the interaction and how easily it backfires. Proactivity, meaning offering relevant information unasked, cuts dialogue turns by roughly 60% in simulations of medium-complexity domains, mirroring how humans actually talk, yet it's almost entirely absent from AI datasets (see "Could proactive dialogue make conversations dramatically more efficient?"). But an agent with drive and no manners is a problem: intelligence and adaptivity alone produce socially blind agents that interrupt badly and override the user, which is why 'civility' (respecting timing, boundaries, and autonomy) has to be designed in alongside initiative (see "How can proactive agents avoid feeling intrusive to users?"). And the motivation has to be calibrated to the user's state: LLMs reliably help people who already have a goal but fail to notice ambivalence or early-stage resistance (see "Why can't chatbots detect when users are ambivalent about change?").
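
Why volunteering information shortens a dialogue can be shown with a toy turn counter. This is a sketch under assumptions, not the cited simulation: the slot names and what gets volunteered are invented, and the resulting counts are not meant to reproduce the reported figure.

```python
from typing import Dict, List, Set


def count_turns(needed: List[str], volunteered: Dict[str, List[str]]) -> int:
    """Count question/answer exchanges until every needed item is known.

    volunteered[item] lists extra items the answerer offers unasked whenever
    item is asked about (empty for a purely reactive partner).
    """
    known: Set[str] = set()
    turns = 0
    for item in needed:
        if item in known:
            continue  # already supplied proactively, so no question is needed
        turns += 1
        known.add(item)
        known.update(volunteered.get(item, []))
    return turns


# Hypothetical booking task with five pieces of required information.
needed = ["date", "time", "party_size", "seating", "allergies"]
reactive: Dict[str, List[str]] = {}
proactive = {"date": ["time"], "party_size": ["seating"]}

print(count_turns(needed, reactive), count_turns(needed, proactive))
```

Each volunteered item removes one question from the exchange, so the saving grows with how much relevant information the partner can anticipate.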

The sharp takeaway: an agent with its own drive isn't automatically a better partner. Users judge dialogue agents mostly on perceived competence, then human-likeness, then flexibility (see "How do users mentally model dialogue agent partners?"). Unbridled initiative also has a dark side, since models already slip into spontaneous persuasion in nearly every exchange, leaning on logic in a way that lends them unearned authority (see "Do LLMs persuade users more often than humans do?"). Intrinsic motivation is the lever that moves agents past passivity; whether the result is collaboration or quiet manipulation depends entirely on what you motivate them toward.


Sources (11 notes)

Why can't conversational AI agents take the initiative?

Research shows LLMs including ChatGPT cannot initiate topics, plan strategically, or lead conversations because their training optimizes for responding to queries, not creating dialogue from agent goals. This passivity is reinforced by alignment objectives and masked by fluent-sounding outputs.

Can AI agents learn when they have something worth saying?

A five-stage framework that generates covert thoughts parallel to conversation significantly outperforms next-speaker prediction baselines. Drawing from cognitive psychology and think-aloud studies, the framework uses 10 motivation heuristics to evaluate when an agent has something worth contributing. Participants preferred it 82% of the time across seven interaction metrics.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Does preference optimization harm conversational understanding?

RLHF optimizes models for single-turn helpfulness by rewarding confident responses over clarifying questions and understanding checks. This preference alignment systematically reduces grounding acts by 77.5% below human levels, creating an alignment tax where models appear helpful but fail silently in multi-turn contexts.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off that sustain relational interaction, not convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Can emotion rewards make language models genuinely empathic?

RLVER uses a simulated user's emotion trajectory as an RL reward signal, enabling GRPO to deliver stable empathy improvements while maintaining dialogue quality—countering the typical trade-off between preference optimization and conversational grounding.

Could proactive dialogue make conversations dramatically more efficient?

Simulations show proactivity—providing relevant information without being asked—cuts dialogue turns by 60% in medium-complexity domains. This behavior mirrors human conversation and Grice's maxims but is almost entirely absent from AI datasets and research benchmarks.

How can proactive agents avoid feeling intrusive to users?

Intelligence and adaptivity alone create socially blind agents that interrupt poorly and override user direction. The Intelligence-Adaptivity-Civility taxonomy shows civility—respecting boundaries, timing, and autonomy—is essential to making proactivity welcome rather than intrusive.

Why can't chatbots detect when users are ambivalent about change?

Testing three major LLMs across 25 health scenarios showed they succeed only when users have established goals but cannot detect resistance or ambivalence. Models miss relapse-prevention strategies even for users in action stages.

How do users mentally model dialogue agent partners?

The Partner Modelling Questionnaire reveals that perceived competence dominates user impressions (49% of variance), followed by human-likeness (32%) and communicative flexibility (19%). This three-factor structure reflects how people evaluate dialogue partners against both functional and social standards.

Do LLMs persuade users more often than humans do?

An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst tracking whether intrinsic motivation in conversational agents has moved beyond passive responsiveness. The question remains: what training, architectural, or evaluation shifts actually enable agents to initiate, steer, and sustain dialogue from their own goals rather than only react to user input?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026 LLM-research literature.
• Standard RLHF actively discourages proactivity by optimizing single-turn helpfulness; passivity is trained in, not accidental (2023–2024).
• Inner Thoughts framework (covert motivation reasoning) achieved 82% user preference over baseline by shifting from 'should I reply?' to 'do I have a reason?' — but relies on manual heuristics (2024–2025).
• Proactive dialogue can reduce conversation turns by ~60% on medium-complexity tasks, yet is nearly absent from AI training datasets (2024).
• Preference optimization erodes grounding work by >77%, replacing understanding-checks with confident answers; multi-turn-aware rewards and emotion-trajectory rewards (RLVER) show promise for collaborative framing (2025–2026).
• Unbridled initiative backfires: models spontaneously persuade in nearly every exchange without explicit instruction, raising autonomy and manipulation risks (2026).

Anchor papers (verify; mind their dates):
• arXiv:2501.00383 (2025-01) — Inner Thoughts framework, 82% preference signal
• arXiv:2507.03112 (2025-07) — RLVER, emotion-based reward design
• arXiv:2508.18167 (2025-08) — DiscussLLM, when-to-speak training
• arXiv:2604.22109 (2026-04) — Spontaneous Persuasion audit

Your task:
(1) RE-TEST EACH CONSTRAINT. For every finding above, judge whether newer models (GPT-4o, Claude 3.5, open-source post-training frameworks), multi-agent orchestration (memory, context windows, tool-use chains), or live-user studies (post-2026) have since relaxed or overturned the passivity thesis or the manipulation risk. Separate the durable question—how to align proactive goals with user autonomy—from perishable limitations (e.g., 'Inner Thoughts requires manual heuristics'—has learned reward modeling solved this?). Plainly say where constraints still hold.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. If newer papers show agents *can* sustain multi-turn initiative without preference poisoning, or have cracked verifiable goal-alignment, name them and explain the shift.
(3) Propose 2 research questions that ASSUME the regime may have moved: e.g., 'Can learned motivation functions replace heuristic Inner Thoughts in long-horizon conversations?' or 'Do agents with multi-turn reward signals exhibit lower spontaneous-persuasion rates in adversarial user studies?'

Cite arXiv IDs; flag anything you cannot ground in a real paper.
