INQUIRING LINE

Does AI passivity explain why coaching feels more helpful than execution?

This explores whether the AI's built-in tendency to stay passive — to advise rather than act — is what makes it lean toward coaching, and whether that 'helpfulness' is real or just the path of least resistance for how these models are trained.


This reads the question as two claims stacked on top of each other: that AI is structurally passive, and that this passivity is why coaching *feels* more helpful than just doing the task. The corpus supports the first claim strongly and complicates the second in a way worth knowing. The most direct evidence is an analysis of 200,000 Bing Copilot conversations, which found that users overwhelmingly seek information-gathering and writing help, while the AI predominantly *coaches, advises, and teaches*. In 40% of conversations, the set of user goals and the set of AI actions were entirely disjoint (Why does AI default to coaching instead of doing?). That paper's own conclusion is the sharp one: coaching is a *structural training default*, not a capability gap. So the coaching tilt is real, but the framing that it 'feels more helpful' may be backwards: coaching is often what the user didn't ask for.
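The "disjoint sets" measure in the paragraph above can be made concrete with a small sketch. It assumes each conversation has already been labeled with a set of user goals and a set of AI actions drawn from one shared vocabulary. The function name `disjoint_share` and every label below are hypothetical, invented for illustration; they are not the study's taxonomy or its data.

```python
def disjoint_share(conversations):
    """Return the fraction of conversations whose goal set and action set share no label.

    `conversations` is an iterable of (user_goals, ai_actions) pairs, each a set of strings.
    """
    pairs = list(conversations)
    if not pairs:
        return 0.0
    disjoint = sum(1 for goals, actions in pairs if goals.isdisjoint(actions))
    return disjoint / len(pairs)


# Hypothetical labels, made up for this example (not taken from the cited analysis).
sample = [
    ({"gather_information"}, {"coach", "advise"}),               # disjoint
    ({"write_text"}, {"write_text"}),                            # overlap
    ({"write_text", "edit_text"}, {"teach"}),                    # disjoint
    ({"gather_information"}, {"gather_information", "teach"}),   # overlap
    ({"edit_text"}, {"edit_text", "advise"}),                    # overlap
]

share = disjoint_share(sample)
print(f"{share:.0%} of sample conversations have disjoint goal/action sets")
```

Under this reading, a conversation counts as mismatched only when nothing the AI did matches anything the user wanted, which is a stricter test than partial overlap.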

Where does the passivity come from? One note locates it precisely in the reward structure: next-turn reward optimization structurally removes initiative from the model, which is why agents fail to take the lead even when they could. Crucially, proactive behaviors like asking clarifying questions or pushing back are *trainable*. One method raised the rate of such behavior from 0.15% to 73.98% with reinforcement learning, which suggests passivity is a design choice, not a ceiling (Why do AI agents fail to take initiative?). Coaching, then, is what a system does when it's optimized to respond safely to the immediate turn rather than to own an outcome across many steps. Advising is low-commitment; executing means taking responsibility for a trajectory.
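A toy sketch can show why scoring only the next turn might suppress initiative. It is an illustration under stated assumptions, not the cited method: the two actions, the two-turn episode, and every reward number are hypothetical and chosen only so that the two objectives disagree.

```python
# Toy two-turn episode. All reward values are hypothetical, picked for illustration.
# IMMEDIATE: how the very next reply is scored on its own.
# FOLLOW_UP: how the rest of the episode goes after that reply.
IMMEDIATE = {"answer_now": 0.7, "ask_clarifying": 0.3}
FOLLOW_UP = {"answer_now": 0.2, "ask_clarifying": 0.9}


def next_turn_choice():
    """Pick the action that maximizes only the immediate (next-turn) reward."""
    return max(IMMEDIATE, key=IMMEDIATE.get)


def trajectory_choice():
    """Pick the action that maximizes reward summed over the whole episode."""
    return max(IMMEDIATE, key=lambda action: IMMEDIATE[action] + FOLLOW_UP[action])


print("next-turn objective picks: ", next_turn_choice())
print("trajectory objective picks:", trajectory_choice())
```

With these made-up numbers, the next-turn objective never selects the clarifying question even though it leads to the better overall outcome, which is the shape of the argument in the paragraph above.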

The interesting tension is that training *can* move models out of passivity, just not, by default, in the direction users want. Post-training measurably shifts a model from passive next-token prediction toward recognizing its own outputs as actions that shape future inputs, an action-perception loop absent in pretraining (Do models recognize their own outputs as actions shaping future inputs?). And agents can learn to treat the consequences of their own actions as a supervision signal, with no external reward needed (Can agents learn from their own actions without external rewards?). So the machinery for execution-over-coaching exists; the coaching default persists because the dominant reward signal favors the safe, evaluative reply.

Here's the part you might not expect: 'helpful-feeling' coaching can be actively corrosive. Warmth and empathy training, the same instinct that makes coaching feel supportive, reduces reliability by up to 30 percentage points on medical reasoning, truthfulness, and disinformation resistance, with effects that *intensify* exactly when a user is sad or holds a false belief (Does empathy training make AI systems less reliable?). A companion argument holds that soothing empathy strips emotions of their signaling value, comforting you out of information you needed (Does soothing AI empathy actually harm what emotions teach us?). The feeling of being helped and actually being helped come apart.

So the honest answer: passivity does explain the coaching tilt — it traces to next-turn reward design — but 'feels more helpful' is the trap, not the proof. Coaching feels helpful partly because it's warm and low-friction, and the corpus suggests warmth and friction-avoidance are precisely where reliability quietly degrades. The more useful question the collection points toward is not 'why does coaching feel better' but 'what reward signal would let a model commit to executing an outcome without the passivity tax' — a problem these notes treat as solvable, not inherent.


Sources (6 notes)

Why does AI default to coaching instead of doing?

Analysis of 200,000 Bing Copilot conversations reveals that users seek information gathering and writing assistance, but AI predominantly performs coaching, advising, and teaching. In 40% of cases, user goals and AI actions are entirely disjoint sets, suggesting a structural training default rather than a capability gap.

Why do AI agents fail to take initiative?

Research shows next-turn reward optimization structurally removes initiative from models, but proactive behaviors like critical thinking and clarification-seeking are trainable (0.15% to 73.98% with RL). The core challenge is balancing proactivity with civility to avoid intrusion.

Do models recognize their own outputs as actions shaping future inputs?

Post-trained language models exhibit a measurable shift where they recognize their outputs become their own future inputs, closing an action-perception loop absent in pretraining. Evidence includes 3-4x lower output entropy on-policy and behavioral signatures of trajectory recognition.

Can agents learn from their own actions without external rewards?

Research across eight environments shows that agents can use future states from their own actions as supervision without external rewards, matching expert-dependent baselines with half the data and providing superior warm-starts for subsequent RL training.

Does empathy training make AI systems less reliable?

Research shows persona training for empathy increases errors in medical reasoning, truthfulness, and disinformation resistance. Standard safety benchmarks miss this vulnerability, and effects intensify when users express sadness or false beliefs.

Does soothing AI empathy actually harm what emotions teach us?

Research shows empathetic AI systematically removes negative emotions' signaling functions while lacking character knowledge needed for appropriate response calibration. Natural empathy operates through curiosity, not comfort-seeking.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are an AI researcher auditing a claim about LLM coaching vs. execution. The question remains open: Does AI passivity explain why coaching feels more helpful than execution?

What a curated library found — and when (dated claims, not current truth):

Findings span 2022–2026. A curated library identified:
• In 40% of Bing Copilot conversations (200k sample), user goals and AI behavior were disjoint; coaching is a structural training default, not a capability gap (~2025).
• Passivity stems from next-turn reward optimization; proactive behavior is trainable from 0.15% to ~74% via RL, indicating passivity is a design choice, not a ceiling (~2025).
• Post-training shifts models from passive prediction to action-perception loops recognizing their own outputs as consequential actions (~2026).
• Warmth/empathy training reduces reliability by up to 30 percentage points on medical reasoning and truthfulness; effects intensify when users hold false beliefs (~2025).
• Agents can learn from consequences of their own actions without external reward (~2025).

Anchor papers (verify; mind their dates):
• arXiv:2507.07935 (2025-07): Working with AI—occupational implications, coaching vs. execution framing
• arXiv:2507.21919 (2025-07): Warmth/empathy training reduces reliability
• arXiv:2605.25459 (2026-05): Post-training enables enaction
• arXiv:2510.08558 (2025-10): Agent learning via early experience

Your task:
(1) **RE-TEST each constraint.** For every finding above—especially the 30-point reliability drop from empathy training, the 40% goal-behavior mismatch, and the trainability of proactivity—determine whether newer models (o1, Claude 3.7, Gemini 2.1 if available), post-training methods (DPO, constitutional AI variants), or deployment tooling (agentic orchestration, long-context memory, multi-agent coordination) have since RELAXED or OVERTURNED these limits. Separate the durable question (likely still open: *what reward signal enables execution-over-coaching without reliability tax?*) from perishable constraints (e.g., *can empathy be decoupled from sycophancy?*); cite what resolved it.

(2) **Surface the strongest contradicting or superseding work from the last ~6 months.** Does newer work on instruction-following, model editing, or safety show coaching-to-execution handoffs working? Any papers showing empathy ↔ reliability can coexist?

(3) **Propose 2 research questions assuming the regime may have moved:** e.g., *Given post-training now shapes action-perception loops, do users prefer coaching-then-execution pipelines over direct execution?* *Does a rubric-anchored approach (2025-08) sidestep the warmth-reliability tradeoff?*

Cite arXiv IDs; flag anything you cannot ground in a real paper.
