SYNTHESIS NOTE

What combination of factors explains differences in LLM persuasiveness?

Why do some LLM persuasion studies show strong effects while others show none? This note explores whether model choice, conversation design, and topic domain together predict when AI actually persuades.

Synthesis note · 2026-05-02 · sourced from Argumentation

When the Bilstein meta-analysis tested moderators individually, none reached significance, likely a power problem with only 7 studies. But the joint model combining model family, conversation design (one-shot vs. interactive multi-turn), and domain (health, political, etc.) explained 81.93% of between-study variance (R²) and reduced residual heterogeneity from I² = 75.97% to I² = 35.51%. Holding the other factors constant, the reported conditional patterns were: interactive multi-turn formats outperformed one-shot formats; GPT-4-based models outperformed Claude 3.x; and health topics yielded stronger effects than political ones.
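The two statistics quoted above can be sketched in a few lines, assuming the meta-analysis used the conventional definitions (the note itself does not say how they were computed): I² as the share of total variation attributed to heterogeneity, derived from Cochran's Q and its degrees of freedom, and R² as the proportional drop in estimated between-study variance (τ²) once moderators are added. The function names and the input values in the usage lines are hypothetical and are not taken from the study.

```python
def i_squared(q: float, df: int) -> float:
    """I^2 in percent: share of total variation attributed to heterogeneity.

    q  -- Cochran's Q statistic
    df -- degrees of freedom (studies minus estimated coefficients)
    """
    if q <= 0:
        return 0.0
    # Truncated at zero, as is conventional when Q falls below its df.
    return max(0.0, (q - df) / q) * 100.0


def pseudo_r_squared(tau2_null: float, tau2_model: float) -> float:
    """R^2 in percent: share of between-study variance explained by moderators.

    tau2_null  -- tau^2 estimated without moderators
    tau2_model -- residual tau^2 estimated with moderators
    """
    if tau2_null <= 0:
        return 0.0
    return max(0.0, (tau2_null - tau2_model) / tau2_null) * 100.0


# HYPOTHETICAL inputs, for illustration only (not values from the meta-analysis).
# With 7 studies and no moderators, df = 7 - 1 = 6.
print(i_squared(q=24.0, df=6))             # 75.0
print(pseudo_r_squared(0.5, 0.125))        # 75.0
```

Under these definitions R² is computed from τ² while I² is computed from Q, so the reported R² cannot be recovered from the two I² values alone.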

This is the operational corollary of the note "Are language models actually more persuasive than humans?" The pooled-null result and the joint-moderator result are not in tension; they are two sides of the same finding. The average effect is approximately zero, while the conditional effect is whatever the model × design × domain combination dictates. The persuasive footprint is in the dial settings, not in the category.
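A toy additive model shows how a near-zero average can coexist with non-zero conditional effects. Every coefficient below is hypothetical, chosen so that each factor's levels sum to zero; none is an estimate from the meta-analysis. The model families are given placeholder names, and a simple unweighted mean over cells stands in for the inverse-variance-weighted pooling a real meta-analysis would use.

```python
from itertools import product

# HYPOTHETICAL coefficients for illustration only, not study estimates.
# Each factor's levels sum to zero, so the intercept is the grand mean.
INTERCEPT = 0.0
COEFS = {
    "design": {"one-shot": -0.10, "multi-turn": 0.10},
    "model": {"family_a": 0.05, "family_b": -0.05},
    "domain": {"health": 0.08, "political": -0.08},
}


def conditional_effect(design: str, model: str, domain: str) -> float:
    """Effect for one model x design x domain combination (additive model)."""
    return (
        INTERCEPT
        + COEFS["design"][design]
        + COEFS["model"][model]
        + COEFS["domain"][domain]
    )


# Enumerate all 2 x 2 x 2 = 8 dial settings.
cells = {
    combo: conditional_effect(*combo)
    for combo in product(COEFS["design"], COEFS["model"], COEFS["domain"])
}

# Unweighted mean across cells (a simplification of meta-analytic pooling).
average_effect = sum(cells.values()) / len(cells)

for combo, effect in sorted(cells.items(), key=lambda kv: kv[1]):
    print(combo, round(effect, 2))
print("average:", round(average_effect, 2))
```

In this sketch the average rounds to zero even though individual cells range from about -0.23 to 0.23, which is the sense in which the effect lives in the dial settings rather than in the category.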

The multi-turn-beats-one-shot finding reweights design priorities. It connects directly to the topic area "Why do AI conversations reliably break down after multiple turns?": persuasive influence accrues across turns, so conversational architecture matters for outcomes that one-shot generation cannot reach. It also stands in productive tension with "Does AI persuasiveness fade across repeated conversations with the same person?" Bilstein finds interactive setups more persuasive than one-shot setups in pooled terms; Schoenegger finds the persuasive advantage over humans waning across rounds. Both can be true: the multi-turn benefit is real but is shared with human persuaders, while the LLM-specific edge is concentrated at first contact.

The model-family signal (GPT-4 > Claude 3.x in this corpus) cautions against generalizing from any single model. Claims about "LLM persuasiveness" anchored to one architecture should be read as architecture-specific until replicated.

For writing about AI persuasion, the operational rule: don't quote a single-study effect size. Cite the meta-analytic null, then specify the dial settings under which a conditional effect appears.


Original note title

combined moderators (model, conversation design, and domain) explain ~82% of between-study variance, and interactive multi-turn beats one-shot