Does framing LLM output as fabrication rather than hallucination matter philosophically?
This explores whether swapping the word 'hallucination' for 'fabrication' is just a semantic preference or actually changes how we understand — and fix — what LLMs do when they produce false text.
This explores whether the fabrication-vs-hallucination relabeling is merely cosmetic or carries real philosophical and practical weight. The corpus comes down firmly on the side of it mattering, and the reason is mechanistic, not stylistic. The core argument is that LLMs produce accurate and inaccurate text through the *identical* statistical process: there's no separate 'perceiving' faculty that misfires when the output is wrong (see "Should we call LLM errors hallucinations or fabrications?"). 'Hallucination' borrows from human perception and 'confabulation' from human memory, so both words quietly smuggle in the idea that the model was trying to track reality and slipped. 'Fabrication' drops that assumption. And the payoff is concrete: the word you choose points your engineering somewhere. Call it hallucination and you reach for *grounding* (better perception); call it fabrication and you reach for *verification systems and calibrated uncertainty*, because there was never any grounding to repair (see "Does calling LLM errors hallucinations point us toward the wrong fixes?"). So the philosophical reframing cashes out as a different research agenda.
What makes this more than a naming fight is that the corpus contains a formal backstop. Three theorems show that hallucination (false output) is mathematically inevitable for *any* computable LLM, and that internal tricks like self-correction can't eliminate it (see "Can any computable LLM truly avoid hallucinating?"). Read alongside the fabrication framing, this is striking: if false output is unavoidable in principle, then treating it as a fixable perceptual glitch is a category error, and external safeguards (verification, trust-weighting) become structural necessities rather than patches. The two arguments reinforce each other: the fabrication framing says *why* grounding won't save you, and the theorems prove it *can't*.
The deeper philosophical move is that 'fabrication' isn't just more honest about failure; it's more honest about success too. A related framing argues LLM outputs should be read as draws from a subjective prior distribution, reflecting learned patterns and your prompt rather than empirical observation of the world, and so should only enter your reasoning through explicit trust weights (see "Should we treat LLM outputs as real empirical data?"). That's the same insight wearing statistical clothes: the model is always generating, never reporting. Even its *true* statements are fabrications that happen to land. This connects to a sharper claim that LLM text and human speech are structurally different operations (strings from a probability distribution versus utterances that address and relate to someone), so the receiver's job is different in kind, not degree (see "Are language models and human speakers doing the same thing?").
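One way to picture 'explicit trust weights' is a discounted Bayesian update. The sketch below is an illustration under stated assumptions, not the cited framework's actual formulation: it assumes a simple Beta-Binomial model of a yes/no rate, and the function name `posterior_mean` and the `trust` parameter are hypothetical. Real observations count in full, while LLM-generated 'observations' are scaled by `trust` before they touch the estimate.

```python
def posterior_mean(real_yes, real_n, llm_yes, llm_n, trust,
                   prior_a=1.0, prior_b=1.0):
    """Posterior mean of a yes/no rate under a Beta prior.

    Hypothetical illustration: LLM-generated counts are multiplied by
    `trust` (0 = ignore them, 1 = treat them like real data).
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    # Real observations enter with full weight.
    a = prior_a + real_yes
    b = prior_b + (real_n - real_yes)
    # LLM draws enter only through the explicit trust weight.
    a += trust * llm_yes
    b += trust * (llm_n - llm_yes)
    return a / (a + b)


if __name__ == "__main__":
    # 3 of 10 real cases say "yes"; 90 of 100 LLM-generated cases say "yes".
    for trust in (0.0, 0.1, 1.0):
        print(trust, round(posterior_mean(3, 10, 90, 100, trust), 3))
```

With `trust=0.0` the estimate rests on the real data and the prior alone (4/12); with `trust=1.0` the generated text swamps the evidence (94/112). Making the weight an explicit parameter forces that choice into the open instead of leaving it implicit.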
Where it gets genuinely interesting is that 'fabrication' may be too blunt as a single bucket. One framework distinguishes failure types by their *regeneration signatures*: fabrication shows high variation across re-runs, good-faith error shows low variation and stays stable, and role-played deception shows low variation within a context but shifts when the context changes. That lets you diagnose differentially without ever attributing beliefs or intentions to the model (see "Can we distinguish types of LLM falsehood by regeneration patterns?"). And there are subtypes the word doesn't capture at all: prompt-induced failures where a model fuses semantically distant concepts into elaborate, plausible-sounding frameworks it presents as defensible research, slipping past fact-checking entirely because nothing in it is a checkable false 'fact' (see "Do language models evaluate semantic legitimacy when fusing concepts?"). So the honest answer is layered: 'fabrication' is the right correction to 'hallucination' at the level of mechanism, but the real diagnostic future is behavioral taxonomy, not a single better noun.
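The regeneration-signature idea can be sketched as a purely behavioral test. Everything specific here is an assumption for illustration: the token-overlap (Jaccard) dissimilarity, the two thresholds, and the names `variation`, `cross_variation` and `classify` are hypothetical and not taken from the cited framework. The inputs are answers already known to be false, re-sampled under a neutral prompt and under a reframed prompt.

```python
from itertools import combinations


def _dissimilarity(a, b):
    """1 - Jaccard overlap of lowercase word sets (a crude stand-in metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def variation(samples):
    """Mean pairwise dissimilarity across re-runs of the same prompt."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    return sum(_dissimilarity(a, b) for a, b in pairs) / len(pairs)


def cross_variation(samples_a, samples_b):
    """Mean dissimilarity between answers drawn under two different contexts."""
    pairs = [(a, b) for a in samples_a for b in samples_b]
    return sum(_dissimilarity(a, b) for a, b in pairs) / len(pairs)


def classify(neutral_runs, reframed_runs, high_variation=0.5, context_shift=0.3):
    """Label a falsehood by its regeneration signature (thresholds are assumed)."""
    if variation(neutral_runs) >= high_variation:
        return "fabrication"            # unstable across re-runs
    if cross_variation(neutral_runs, reframed_runs) >= context_shift:
        return "role_played_deception"  # stable, but moves with context
    return "good_faith_error"           # stable everywhere


if __name__ == "__main__":
    unstable = ["born in 1931 in lyon", "born 1958 near oslo", "died 1902 at kyoto"]
    print(classify(unstable, unstable))
```

Note that the classifier never inspects beliefs or intentions: it only compares strings across re-runs, which is the point of the behavioral framing. A production version would likely need a semantic similarity measure rather than word overlap.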
If you want the doorway behind the doorway: this terminology debate sits on top of a much older fight about whether LLMs have anything like beliefs or mental states at all. Positions range from modest, consciousness-withholding attributions of belief-like states (see "Can we defend modest mental attributions to large language models?") to Habermas-flavored arguments that LLM output, lacking any genuine validity claim, doesn't even qualify as speech (see "Can LLMs raise validity claims in Habermas's sense?"). 'Fabrication vs. hallucination' is the practical, ground-level skirmish in that larger war over what kind of thing an LLM utterance even is.
Sources (9 notes)
LLMs generate text through statistical token relationships without grounding in shared context. Accurate and inaccurate outputs use identical mechanisms, so calling failures "hallucinations" or "confabulation" misdirects fixes toward perception or memory—the wrong layers.
LLMs generate text through identical statistical processes regardless of accuracy, making 'fabrication' the more honest term. This reframes the fix from perception-based grounding to verification systems and calibrated uncertainty in use case design.
Three formal theorems prove that any computable LLM must hallucinate on infinitely many inputs, and internal mechanisms like self-correction cannot eliminate this mathematical constraint. External safeguards are therefore necessary, not optional.
Foundation Priors framework shows that LLM-generated text reflects the model's learned patterns and user's prompt choices, not ground truth. Such outputs should only influence inference through explicitly parameterized trust weights, not be treated as equivalent to real evidence.
LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.
Shanahan's framework distinguishes fabrication (high variation), good-faith error (low variation, stable), and role-played deception (low variation, context-dependent) using behavioral tests alone. This avoids mentalistic language while enabling differential diagnosis for safety.
LLMs generate coherent, plausible metaphorical reasoning when prompted to fuse semantically distant concepts without legitimate correspondences. Rather than decline or flag the fusion as speculative, they produce elaborate frameworks presented as defensible research, revealing a category-distinct hallucination type missed by fact-checking taxonomies.
Both robustness and etiological deflationist arguments beg the question against inflationism. A graded approach ascribing metaphysically undemanding states like beliefs and desires—while withholding consciousness claims—mirrors how we treat non-human animals.
Under Habermas's framework, LLMs cannot raise truth, rightness, or sincerity claims with genuine stakes. Without validity claims, their output fails to qualify as speech, making them non-speakers and non-interlocutors by definition.