SYNTHESIS NOTE

Can pretraining data statistics detect hallucinations better than model confidence?

Explores whether checking if entity combinations appeared in the training data is a more reliable hallucination signal than the model's own confidence, especially for catching confidently wrong outputs.

Synthesis note · 2026-05-03
Where do retrieval systems fail and why?

Adaptive RAG systems decide when to retrieve based on the model's own confidence: if the model is uncertain, fetch external evidence. But confidence is a notoriously bad hallucination signal: models often produce confidently wrong outputs precisely on entities they have rarely seen, or on entity pairs they have never seen together. QuCo-RAG bypasses confidence entirely and uses pretraining-data statistics directly: it checks whether the entities mentioned in a query are rare and, more importantly, whether the specific entity combinations have co-occurred in the pretraining data. If a query mentions two entities that the model's training corpus never saw in proximity, that is the retrieval trigger.
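The trigger described above can be sketched roughly as follows. This is a minimal illustration, not QuCo-RAG's actual implementation: the lookup tables, the `should_retrieve` function, and both thresholds are hypothetical stand-ins for whatever corpus index and cutoffs a real system would use.

```python
from itertools import combinations

# Hypothetical toy statistics standing in for a real pretraining-corpus index.
ENTITY_COUNTS = {
    "Marie Curie": 12000,
    "Warsaw": 45000,
    "Tokyo": 80000,
    "Kiribati": 40,
}
COOCCURRENCE_COUNTS = {
    frozenset({"Marie Curie", "Warsaw"}): 900,
}


def should_retrieve(entities, min_entity_count=100, min_pair_count=1):
    """Return True when the query touches sparsely covered data.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    # Trigger 1: any single entity is rare in the corpus.
    if any(ENTITY_COUNTS.get(e, 0) < min_entity_count for e in entities):
        return True
    # Trigger 2: some pair of entities (almost) never co-occurred.
    for a, b in combinations(entities, 2):
        if COOCCURRENCE_COUNTS.get(frozenset({a, b}), 0) < min_pair_count:
            return True
    return False


# Well-covered pair: no retrieval needed under these toy counts.
print(should_retrieve(["Marie Curie", "Warsaw"]))
# Both entities are common, but the pair was never seen together.
print(should_retrieve(["Marie Curie", "Tokyo"]))
```

Note that the decision never consults the model; it depends only on the query's entities and the corpus statistics.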

The methodological move is replacing an internal symptom (low confidence) with an external cause (data sparsity). Hallucination is what happens when the model interpolates over combinations it never saw; checking pretraining co-occurrence catches the condition before the symptom rather than after. This means QuCo-RAG can flag suspicious outputs even when the model is highly confident, which is the regime where calibration-based methods fail hardest. This stance is in direct tension with the note "When should retrieval happen during model generation?", which treats confidence as the right trigger; see ops/tensions/retrieval trigger signal — pretraining-data statistics vs model uncertainty.md for the full disagreement.
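A small side-by-side sketch makes the contrast concrete. Both functions and their thresholds are hypothetical, chosen only to illustrate the confidently wrong regime; neither is taken from QuCo-RAG or from any specific calibration method.

```python
def confidence_trigger(model_confidence, threshold=0.5):
    """Symptom-based trigger: retrieve only when the model reports low confidence."""
    return model_confidence < threshold


def cooccurrence_trigger(pair_count, min_pair_count=1):
    """Cause-based trigger: retrieve when the entity pair is absent from the corpus."""
    return pair_count < min_pair_count


# A confidently wrong case: the model is sure of itself,
# yet the entity pair never appeared together in its training data.
case = {"model_confidence": 0.95, "pair_count": 0}

fires_on_confidence = confidence_trigger(case["model_confidence"])
fires_on_data = cooccurrence_trigger(case["pair_count"])
print(fires_on_confidence, fires_on_data)
```

In this made-up case the confidence trigger stays silent while the co-occurrence trigger fires, which is the gap the note argues calibration-based methods cannot close.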

The cost is access to pretraining-data statistics, which is non-trivial for opaque models but tractable for open-weight ones. The deeper implication is that hallucination detection may benefit more from data-side instrumentation than from probing the model's internal states — the training distribution is the ground truth about what the model can reasonably know, and confidence is only a noisy proxy for that.
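When a corpus sample is available, the statistics themselves are simple to gather in principle. The sketch below assumes document-level co-occurrence as a stand-in for "proximity" and uses naive case-insensitive substring matching instead of real entity linking; `count_cooccurrences` is a hypothetical helper, and a production system would need an index built over the full corpus.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(documents, entities):
    """Count how many documents mention each entity and each entity pair.

    Assumption: two entities "co-occur" if they appear in the same document.
    """
    entity_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        text = doc.lower()
        # Naive matching: an entity is present if its name is a substring.
        present = sorted(e for e in entities if e.lower() in text)
        entity_counts.update(present)
        pair_counts.update(frozenset(p) for p in combinations(present, 2))
    return entity_counts, pair_counts


docs = [
    "Marie Curie was born in Warsaw.",
    "Tokyo is the capital of Japan.",
    "Marie Curie won two Nobel Prizes.",
]
entity_counts, pair_counts = count_cooccurrences(
    docs, ["Marie Curie", "Warsaw", "Tokyo"]
)
print(entity_counts["Marie Curie"])
print(pair_counts[frozenset({"Marie Curie", "Tokyo"})])
```

A `Counter` returns 0 for pairs it never saw, so a missing pair reads directly as "never co-occurred" in this sample.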

Original note title

pretraining-data statistics should trigger retrieval not model confidence — rare entity co-occurrence flags hallucination risk that calibration cannot detect