SYNTHESIS NOTE


Should we treat dialogue agents as role-playing characters?

Does the role-play framing successfully avoid anthropomorphism while preserving folk-psychological vocabulary for describing LLM behavior? This matters because it shapes whether we attribute genuine mental states to dialogue systems.

Synthesis note · 2026-04-15 · sourced from Role-Play with Large Language Models

Shanahan, McDonell, and Reynolds propose role-play as the foundational metaphor for understanding LLM dialogue agents. The framing solves a specific problem: folk-psychological vocabulary (beliefs, desires, goals, intentions) is the natural language for describing coherent dialogue behavior, but applying it literally to the LLM promotes anthropomorphism. Role-play offers a middle way — one can say the character believes p, wants q, intends r, while maintaining that the system playing the character does not have these states itself.

The move has a precise structure. The dialogue prompt (system prompt, preamble, sample exchanges) establishes the character the agent will play. The underlying LLM's task — generating continuations consistent with the training distribution — means the most plausible continuation is whatever a person matching the prompted character would say. The model is not a character; it is an engine that produces character-consistent text. The folk-psychological vocabulary attaches to the output-pattern, not to the producer of the pattern.
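The structure described above can be sketched as a small prompt-assembly step. This is a minimal illustration, not the authors' implementation: the `DialoguePrompt` class, its field names, the speaker labels, and the preamble text are all hypothetical, and no real model is called. The sketch only shows how the preamble, sample exchanges, and the latest user turn are flattened into one text that ends where the character is due to speak.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DialoguePrompt:
    """Hypothetical container for the prompt pieces the note lists."""
    preamble: str  # system prompt / description of the character
    sample_exchanges: List[Tuple[str, str]] = field(default_factory=list)
    agent_name: str = "Assistant"  # assumed label for the played character
    user_name: str = "User"        # assumed label for the human speaker

    def render(self, user_turn: str) -> str:
        """Flatten the pieces into one text for a base model to continue."""
        lines = [self.preamble, ""]
        for user_text, agent_text in self.sample_exchanges:
            lines.append(f"{self.user_name}: {user_text}")
            lines.append(f"{self.agent_name}: {agent_text}")
        lines.append(f"{self.user_name}: {user_turn}")
        # The text stops on the character's label, so the most plausible
        # continuation is what that character would say next.
        lines.append(f"{self.agent_name}:")
        return "\n".join(lines)


# Illustrative values only; none of this text comes from the source paper.
prompt = DialoguePrompt(
    preamble="This is a conversation between User and Assistant, a helpful AI.",
    sample_exchanges=[
        ("What is the capital of France?", "The capital of France is Paris."),
    ],
)
text = prompt.render("Who wrote Hamlet?")
print(text)
```

On this picture, a claim such as "the Assistant believes Paris is the capital of France" describes the pattern in the rendered dialogue and its continuation. It makes no claim about whatever function performs the continuation.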

This framing is the direct target of Chalmers' realizationism. Where Shanahan says it is role-play all the way down, Chalmers argues that post-training transforms play into realization: the RLHF'd persona is no longer a character sitting on a neutral substrate but has become the disposition of the system itself. The disagreement is not about the behavioral facts but about what those facts license. Both agree that the system produces belief-consistent behavior; they disagree on whether the system thereby has quasi-beliefs (Chalmers) or merely plays a character that does (Shanahan).

Inquiring lines that read this note 52

This note is a source for the research questions listed below.

Does conversational format create illusions of genuine AI communication?

How do LLMs distinguish causal reasoning from temporal and semantic associations?

How can language models sustain linguistic synchrony and intersubjectivity during dialogue?

What would co-constructed identity between human and model dialogue look like?

Is embodied interaction necessary for language meaning and genuine agency?

How do language models establish social grounding in human dialogue?

How does psychological continuity theory apply to identity across LLM conversation threads?

Why do language models reinforce false assumptions instead of correcting them?

How can LLM user simulators model realistic goal-driven conversation?

Why do persona-level simulations fail to predict individual preferences accurately?

Can ensemble evaluation methods reduce bias more than single judges?

What distinguishes evaluative stance-taking from the mechanical conformity shape-holding describes?

How do interface design choices shape consciousness attribution?

Can LLM personas constitute genuine psychology or remain linguistic role-play?

Can prompting strategies overcome LLM biases without model fine-tuning?

Can prompt engineering fully prevent role flipping in LLM agents?

Can AI-generated outputs constitute genuine knowledge or valid claims?

Why does mimicking human behavior differ from simulating human cognition?

Do language models learn genuine linguistic structure or just surface patterns?

What distinguishes character simulation from authentic voice in language model outputs?

Does AI fluency substitute for verifiable accuracy in human judgment?

How much does anthropomorphizing stylistic traces mislead users about AI reliability?

Can AI systems develop genuine social understanding without embodiment?

How can persona representations reduce language model variance and improve task accuracy?

How faithfully do LLMs reflect their actual reasoning in outputs and explanations?

What prevents language models from reliably adopting diverse personas?

Does combining role and personality prompts produce stable behavioral changes?

How can conversational AI maintain consistent personas across conversations?

How should dialogue systems represent uncertainty from noisy speech input?

Can dialogue agents be reliable but still feel inflexible or cold?

Does alignment training create blind spots in detecting genuine safety threats?

How does safety alignment further degrade villain character portrayal?

How should conversational agents balance goal-driven initiative with user control?

How can dialogue structure and trajectory predict social agent performance?

Related concepts in this collection 3

Are RLHF personas performed characters or realized…

Can we describe LLM beliefs without assuming consc…

Does a language model have an authentic voice unde…
