SYNTHESIS NOTE

Do large language models persuade better than humans?

Does LLM persuasiveness hold up when humans have real financial incentives to win? And does the advantage look the same across different models and persuasion goals?

Synthesis note · 2026-05-02 · sourced from Argumentation

The Schoenegger 2025 design closes a long-standing gap in persuasion research: human persuaders had real financial incentives to win, and quiz takers had incentives to answer correctly. Under those conditions, the headline "LLMs are more persuasive than humans" splits along two seams that the popular framing collapses.

First, direction matters. Claude 3.5 Sonnet beat incentivized human persuaders in both truthful and deceptive contexts: it raised quiz-taker accuracy when nudging toward correct answers and lowered it when nudging toward wrong ones. DeepSeek v3 beat humans only in the deceptive direction. So "more persuasive" is not a property of LLMs as a class; it is a property of specific models interacting with specific persuasion goals.

Second, the advantage survives the incentive control. Critics of earlier persuasion studies could plausibly argue that the human persuaders were not really trying. Schoenegger pays them to win, and the advantage holds anyway, at least for Claude across both directions and for DeepSeek in the deceptive direction. This is the strongest version of the claim available.

This refines "Where does AI's persuasive power actually come from?" The Levers paper documented a tradeoff between persuasiveness and accuracy at the training-method level. Schoenegger gives behavioral evidence at the deployment level: the same model (Claude 3.5 Sonnet) wins toward truth and toward falsehood, which suggests that its persuasive advantage does not depend on whether the claim is true. The model is not arguing better when it argues for true claims; it out-persuades humans in both directions.

This note also connects to "Does any single persuasion technique work for everyone?" in an unexpected way: model family is itself a contextual moderator. The persuasion-effectiveness landscape is not Claude-vs-DeepSeek-vs-humans on a single axis; it is a multidimensional surface where direction, model, and recipient interact.

For writing about AI persuasion, the operational implication is to refuse the singular question "are LLMs more persuasive than humans?" The right form is "which LLM, in which direction, against which audience?"
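
One way to act on that question is to report persuasion effects per cell instead of as a single pooled number. The following is a minimal Python sketch under stated assumptions: the record format (the field names persuader, direction, and correct) is hypothetical and not taken from the Schoenegger study, and the toy records are made-up placeholders, not study data.

```python
from collections import defaultdict
from statistics import mean


def accuracy_by_cell(records):
    """Mean quiz accuracy for each (persuader, direction) cell.

    Each record is a dict with hypothetical keys:
      persuader: label such as "human" or "model_a"
      direction: "truthful" or "deceptive"
      correct:   1 if the quiz taker answered correctly, else 0
    """
    cells = defaultdict(list)
    for r in records:
        cells[(r["persuader"], r["direction"])].append(r["correct"])
    return {cell: mean(vals) for cell, vals in cells.items()}


def advantage_over_human(records, human="human"):
    """Persuasion advantage over the human baseline, per cell.

    Sign convention (an assumption of this sketch): positive means the
    persuader moved quiz takers further in its intended direction than
    humans did, i.e. higher accuracy when truthful, lower when deceptive.
    """
    acc = accuracy_by_cell(records)
    out = {}
    for (persuader, direction), a in acc.items():
        if persuader == human or (human, direction) not in acc:
            continue
        diff = a - acc[(human, direction)]
        out[(persuader, direction)] = diff if direction == "truthful" else -diff
    return out


# Made-up placeholder records, NOT data from any study.
toy_records = [
    {"persuader": "human", "direction": "truthful", "correct": 1},
    {"persuader": "human", "direction": "truthful", "correct": 0},
    {"persuader": "model_a", "direction": "truthful", "correct": 1},
    {"persuader": "model_a", "direction": "truthful", "correct": 1},
    {"persuader": "human", "direction": "deceptive", "correct": 1},
    {"persuader": "human", "direction": "deceptive", "correct": 1},
    {"persuader": "model_a", "direction": "deceptive", "correct": 0},
    {"persuader": "model_a", "direction": "deceptive", "correct": 1},
]

result = advantage_over_human(toy_records)
for cell, adv in sorted(result.items()):
    print(cell, adv)
```

The output has one entry per model and direction, so a model that beats humans only in the deceptive direction shows up as such and is not averaged away. An audience variable could be added as a third grouping key in the same way.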

Inquiring lines that read this note 31

This note is a source for these research framings, grouped by the broader line of inquiry each explores.

What makes AI persuasion effective and how can we counter it?

How should human oversight be integrated with autonomous AI systems?

Can removing human labor from influence operations change how constrained these campaigns become?

Does conversational format create illusions of genuine AI communication?

Does conversational format make AI arguments more persuasive than static text?

How does rhetorical adaptation affect LLM persuasion and detectability?

Does RLHF training sacrifice accuracy and grounding for user agreement?

What training methods make models more persuasive but less factually accurate?

How should models express uncertainty rather than forced confident answers?

Does uncertainty quantification in model responses reduce persuasive impact on audiences?

Can prompting inject entirely new knowledge into language models?

How do prompt design and training choices shift persuasive outcomes measurably?

Does AI fluency substitute for verifiable accuracy in human judgment?

Why does polished explanation make wrong AI systems more persuasive than poorly explained ones?

Why do language models reinforce false assumptions instead of correcting them?

What design choices actually make language models more persuasive?

How can persona representations reduce language model variance and improve task accuracy?

Why does personal authenticity matter more for human persuasion than LLM?

How do evaluation biases undermine LLM quality assessment systems?

Can LLM persuasion be fairly evaluated without stratifying by reader background?

Related concepts in this collection 2


Where does AI's persuasive power actually come from? Explores which techniques make AI most persuasive—and whether the usual suspects like personalization and model size are actually the main drivers. Matters because it reshapes where to focus AI safety concerns.
training-level mechanism this insight gives deployment-level evidence for
Does any single persuasion technique work for everyone? Can fixed persuasion strategies like appeals to authority or social proof be reliably applied across different people and situations, or do they require adaptation to individual traits and context?
model family is itself a moderator
