SYNTHESIS NOTE
Psychology, Society, and Alignment

Do reasoning scaffolds reshape which empathy skills models develop?

When language models receive identical empathy rewards, does adding explicit reasoning blocks before responses change which capabilities they actually improve? This matters for understanding how training structure, not just training signal, shapes model development.

Synthesis note · 2026-02-22 · sourced from Psychology Empathy

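Under RLVER training with identical verifiable emotion rewards, models with and without explicit reasoning scaffolds develop along different axes: thinking models enhance empathy and insight, while non-thinking models focus on action-oriented capabilities.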

This divergence under the same training signal is the key finding. The explicit reasoning scaffold doesn't just improve the model — it redirects what the model improves at. The think-then-say template forces the model to "access and refine higher-order empathetic skills" by externalizing its reasoning about the user's emotional state before responding.
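To make the think-then-say setup concrete, here is a minimal sketch in Python. It assumes the scaffold wraps reasoning in `<think>...</think>` tags and that the reward sees only the spoken reply; neither detail is specified in this note. The names `split_think_then_say` and `score_reply` are hypothetical and are not part of RLVER.

```python
import re
from typing import Callable, Tuple

# Assumed tag format for the reasoning block; the note does not specify one.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>\s*(.*)", re.DOTALL)


def split_think_then_say(output: str) -> Tuple[str, str]:
    """Split a model output into (reasoning, reply).

    Outputs without a leading reasoning block (the non-thinking variant)
    are returned with empty reasoning and the whole text as the reply.
    """
    text = output.strip()
    match = THINK_PATTERN.fullmatch(text)
    if match is None:
        return "", text
    return match.group(1).strip(), match.group(2).strip()


def score_reply(output: str, reward_fn: Callable[[str], float]) -> float:
    """Apply the same reward function to the reply of either variant.

    Assumption: the reasoning block is hidden from the reward, so thinking
    and non-thinking models are scored on identical terms.
    """
    _, reply = split_think_then_say(output)
    return reward_fn(reply)
```

Under these assumptions the two variants differ only in whether the policy writes down its reading of the user's emotional state before replying; the reward path is the same for both, which is what makes the divergence in learned skills notable.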

This connects to the broader reasoning literature in two ways:

First, it parallels "Does RL teach reasoning or just when to use it?": the thinking scaffold provides a pre-existing mechanism (extended deliberation), and RL teaches the model when and how to apply that mechanism to empathetic dialogue. On this reading, the capability was latent; RL surfaces it through the scaffold.

Second, it complicates the picture in "When does explicit reasoning actually help model performance?" Empathy is arguably a "continuous nuanced judgment" task, yet the thinking scaffold helps. The resolution may be that the scaffold here works not by imposing logical structure on empathy, but by creating space for the model to deliberate about social context before committing to a response.

Original note title

Thinking and non-thinking models develop distinct empathy profiles under RL training — thinking models enhance empathy and insight while non-thinking models focus on action-oriented capabilities