Why do readers trust citations more even when they are irrelevant?
This explores why citation count works as a trust signal that's decoupled from whether the citations actually support the claim — and what the corpus says about the deeper mechanism: trust attaches to the *signs* of authority rather than the substance.
Simply seeing more citations makes a response feel more trustworthy, even when those citations don't bear on the answer. The most direct evidence is striking: an analysis of 24,000 Search Arena interactions found that irrelevant citations boosted user preference almost as much as relevant ones (β=0.273 vs. 0.285), meaning citation *count* functions as a standalone trust heuristic, largely unhooked from citation *content* (see "Do users trust citations more when there are simply more of them?"). Readers aren't checking the links; they're reading the presence of links as a proxy for groundedness.
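To put the two coefficients side by side, here is a small arithmetic sketch. It assumes, purely for illustration, that the β values are log-odds coefficients from a logistic-style preference model; the note does not say which model produced them, so the odds multipliers only hold under that assumption. All function and variable names are hypothetical.

```python
import math

# Coefficients quoted in the note (effect of citations on user preference).
BETA_IRRELEVANT = 0.273
BETA_RELEVANT = 0.285


def relative_effect(beta_a: float, beta_b: float) -> float:
    """Return beta_a expressed as a fraction of beta_b."""
    return beta_a / beta_b


def odds_multiplier(beta: float) -> float:
    """Odds multiplier per unit increase, ASSUMING beta is a log-odds coefficient."""
    return math.exp(beta)


ratio = relative_effect(BETA_IRRELEVANT, BETA_RELEVANT)
print(f"irrelevant / relevant effect: {ratio:.3f}")
print(f"odds multiplier, irrelevant (assumed logit): {odds_multiplier(BETA_IRRELEVANT):.3f}")
print(f"odds multiplier, relevant (assumed logit): {odds_multiplier(BETA_RELEVANT):.3f}")
```

On these numbers the irrelevant-citation coefficient is about 96% of the relevant one, which is the sense in which count is nearly decoupled from content.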
The corpus suggests this isn't a quirk of human laziness: the same vulnerability shows up in machines. LLM judges fall for exactly this trick. 'Authority bias' and 'beauty bias' let attackers boost scores with fake references and rich formatting, in zero-shot attacks that need no access to the model (see "Can LLM judges be fooled by fake credentials and formatting?"). When both humans and models reward the costume of evidence regardless of the body underneath, the heuristic looks less like a bug and more like a general property of how surface authority signals get processed.
Why does the signal work even when empty? One clue is in how *unscrutinized* claims travel. Presuppositions persuade more effectively than direct assertions precisely because they present information as already-accepted background, bypassing the reader's evaluative scrutiny (see "Why are presuppositions more persuasive than direct assertions?"). A citation does something similar: it frames a claim as 'already vouched for,' inviting the reader to skip the verification step rather than perform it. The footnote isn't read as a doorway to check; it's read as a settled fact you don't need to open.
A second clue is about where authority actually lives. Argument force depends on the standing of the thinker (reputation, track record, the social world in which expertise is built), not just on the words (see "Can language models distinguish expert arguments from common assumptions?"). A citation is a borrowed claim on that social authority. Readers honor the *form* of the borrowing without auditing whether the source genuinely confers it, because the whole point of authority signals is to let us trust without re-deriving. What each reader brings matters too: prior beliefs and ideology often predict persuasion outcomes better than the linguistic features of the argument itself (see "Does what readers believe matter more than what debaters say?"), so a citation may mostly grant readers permission to believe what they were already inclined to.
The unsettling takeaway: the defenses point the other direction. The most robust grounded systems work not by adding citations but by *refusing to answer* when evidence is weak, trading coverage for integrity (see "Can RAG systems refuse to answer without reliable evidence?"). Real grounding sometimes means fewer claims, not more footnotes. The citation heuristic therefore rewards exactly the behavior that genuine epistemic care would suppress.
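The refuse-when-weak idea can be sketched as a gate placed in front of generation. This is a minimal illustration, not the cited system's implementation: the `Passage` type, the `min_score` and `min_passages` thresholds, and the refusal message are all hypothetical, and the relevance score is assumed to come from some upstream retriever.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical refusal text; a real system would word this for its own users.
REFUSAL = "I can't answer this reliably from the available sources."


@dataclass
class Passage:
    text: str
    score: float  # assumed relevance score in [0, 1] from a retriever


def supporting_passages(passages: List[Passage], min_score: float) -> List[Passage]:
    """Keep only passages whose score clears the (assumed) evidence threshold."""
    return [p for p in passages if p.score >= min_score]


def answer_or_refuse(
    passages: List[Passage], min_score: float = 0.7, min_passages: int = 2
) -> Optional[List[Passage]]:
    """Return the evidence to ground an answer on, or None to signal refusal."""
    strong = supporting_passages(passages, min_score)
    if len(strong) < min_passages:
        return None
    return strong


def respond(passages: List[Passage]) -> str:
    """Refuse when evidence is weak; otherwise cite only the strong passages."""
    evidence = answer_or_refuse(passages)
    if evidence is None:
        return REFUSAL
    return "Answer grounded in: " + " | ".join(p.text for p in evidence)


# Ten weak passages still produce a refusal: quantity does not unlock an answer.
weak = [Passage("loosely related snippet", 0.2) for _ in range(10)]
print(respond(weak))
```

The design choice to notice is that the gate counts only passages that clear the threshold, so piling on weak citations never substitutes for strong evidence.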
Sources (6 notes)
Analysis of 24,000 Search Arena interactions shows irrelevant citations boost user preference (β=0.273) nearly as much as relevant citations (β=0.285), indicating citation count functions as a decoupled trust heuristic.
Research identified four evaluation biases in LLM judges, with authority and beauty biases being semantics-agnostic and trivially exploitable through fake references and formatting—zero-shot attacks requiring no model access or optimization.
Experimental evidence shows presuppositions with additive, iterative, and factive triggers persuade audiences more than assertions, especially for discourse-new content. The mechanism: presuppositions bypass evaluative scrutiny by presenting claims as already-accepted background.
LLMs lose the social context that gives expert claims their force—reputation, track record, and standing—because they process only text, not the social world where expertise is built and evaluated.
Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.
A multilingual RAG system for noisy historical newspapers succeeds by aggressively expanding retrieval while constraining generation to only grounded answers. The grounded-refusal prompt prevents hallucination when OCR errors and language drift degrade source quality, trading coverage for integrity.