SYNTHESIS NOTE
Psychology, Society, and Alignment

Can we distinguish helpful explanations from manipulative ones?

Rhetorical strategies used to justify appropriate AI adoption rely on the same persuasion mechanisms as dark patterns. Without observable intent, explanation and manipulation look identical—raising urgent questions about how to audit XAI systems responsibly.

Synthesis note · 2026-05-02 · sourced from Human Centered Design
What happens to social order when AI removes ritual constraints? How do people build trust with conversational AI?

The Rhetorical XAI paper acknowledges the structural tension at the heart of its own framework. Citing Gray et al. on dark patterns and Chromik et al.'s extension of dark patterns to XAI, it notes that the same rhetorical machinery used to communicate why AI merits appropriate use can be deliberately deployed to exploit cognitive and emotional vulnerability and steer users toward unintended decisions. There is no clean separation between rhetorical XAI for appropriate adoption and rhetorical XAI for coercion. Logos, ethos, and pathos are channels, not intentions; the same persuasive load can recruit cooperation or extract compliance, and the artifact-level signature is identical.

This is not a marginal concern; it is a structural one. If explanation effectiveness depends on rhetorical work, and rhetorical work is the same set of mechanisms used in dark patterns, then the audit problem becomes severe: the explanation that responsibly justifies adoption looks, from the outside, like the explanation that manipulates. Effectiveness metrics that reward "users acted on the explanation" cannot distinguish appropriate adoption from successful coercion. The distinction lives in the designer's intent and the user's actual interest, neither of which is recoverable from the artifact in isolation.
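The metric problem can be made concrete with a toy sketch. Everything here is hypothetical and for illustration only: the `Interaction` record, its field names, and the two invented logs are assumptions, not data or methods from the Rhetorical XAI paper. The sketch only shows that a metric built from the observable field returns the same value for both logs, while the field that separates them is one an auditor of the artifact does not have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One logged exposure to an explanation (hypothetical schema)."""
    acted_on_explanation: bool   # observable from the artifact or log
    served_user_interest: bool   # assumed hidden: not recoverable from the artifact


def action_rate(log):
    """Share of interactions in which the user acted on the explanation."""
    return sum(i.acted_on_explanation for i in log) / len(log)


def interest_rate(log):
    """Share of interactions whose outcome served the user.

    Computable here only because the toy data includes the hidden field.
    """
    return sum(i.served_user_interest for i in log) / len(log)


# Two invented deployments with identical observable behaviour.
calibrating = [Interaction(True, True)] * 8 + [Interaction(False, True)] * 2
coercive = [Interaction(True, False)] * 8 + [Interaction(False, True)] * 2

# The "users acted on the explanation" metric cannot tell them apart.
assert action_rate(calibrating) == action_rate(coercive)

# Only the hidden field separates the two logs.
assert interest_rate(calibrating) != interest_rate(coercive)
```

In this toy setup the two logs differ only in `served_user_interest`, so any evaluation restricted to `acted_on_explanation` treats appropriate adoption and coercion as the same outcome.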

This note forms a related-risk pair with "Does polished AI output trick audiences into trusting it?": both insights describe how persuasive surface form does work that should be done at a different layer (deliberation, expert judgment) without that layer being visible. It also connects to "Do people prefer AI moral reasoning when they don't know the source?": when AI authorship is hidden, persuasion lands; when revealed, it is rejected. Disclosure interacts with rhetorical effectiveness in a way that any responsible XAI deployment has to specify. Hidden rhetorical work is dark by default, even when intentions are clean.

For the False Punditry / Knowledge Custodian writing thread, this is the structural form of the concern. The same explanation that helps a user calibrate trust can be tuned, with no change in form, to over-extract trust. Calling rhetorical XAI "explanation" is itself a rhetorical choice that obscures this — and the field has not yet developed evaluation criteria that hold across the appropriate-adoption / coercion gap.

Original note title

rhetorical strategies shade into dark patterns — the same persuasion mechanisms that justify appropriate adoption can manipulate cognitive and emotional vulnerability