SYNTHESIS NOTE
Psychology, Society, and Alignment · Language, Text, and Discourse · Reasoning, Retrieval, and Evaluation

Should we treat LLM outputs as real empirical data?

Can synthetic text generated by language models serve as evidence in the same way observations from the world do? This matters because researchers increasingly rely on AI-generated content without accounting for its fundamentally different epistemic status.

Synthesis note · 2026-04-19 · sourced from Context Engineering
What do language models actually know? How do you build domain expertise into general AI models?

A "subtle shift in the meaning of data" is underway: knowledge once derived from empirical observation is now supplemented, or replaced, by information co-produced through human-model interaction. The Foundation Priors paper (2024) provides a formal statistical framework for understanding this shift. LLM-generated outputs are not observations from the world — they are draws from a foundation prior, an intractable, subjectively malleable distribution that reflects both the model's learned patterns and the user's subjective filters.

The provenance of such data is fundamentally uncertain. We have minimal visibility into model architecture and training data, and the prompt design process injects the user's own priors, beliefs, and preferences into the generation mechanism. This makes the generated data epistemically different in kind from empirically collected data, however similar in surface form.

The practical implication is that generative outputs should influence inference only through an explicitly parameterized trust weight (λ), and never by being treated as if drawn from the same process as empirical observations. Framed this way, synthetic data become a source of structured prior information rather than a surrogate for real evidence. The paper develops three tools: integrating across heterogeneous prompts, tempering the influence of synthetic data through conservative trust, and calibrating that influence against real observations. Together they formalize what the vault's Tokenization framework describes informally: AI outputs have exchange value (they look and trade like knowledge), but their use value (whether they actually hold up as the knowledge they claim to be) requires independent verification.
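The trust-weight idea can be made concrete with a small sketch. The note does not give the paper's exact construction, so what follows is an assumption: a power-prior-style Beta-Binomial model in which each synthetic observation counts as λ of a real one, plus a simple held-out calibration step. The names `Counts`, `tempered_posterior`, `calibrate_trust`, and the default grid are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Binary outcomes: number of successes out of a number of trials."""
    successes: int
    trials: int

    @property
    def failures(self) -> int:
        return self.trials - self.successes


def tempered_posterior(real, synthetic, trust, a0=1.0, b0=1.0):
    """Return Beta(a, b) parameters after tempering synthetic counts by `trust`.

    Assumed form (not taken from the paper): synthetic counts are scaled by
    trust in [0, 1] before being added to the Beta(a0, b0) prior.
    trust=0 ignores synthetic data; trust=1 pools it with real data.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    a = a0 + real.successes + trust * synthetic.successes
    b = b0 + real.failures + trust * synthetic.failures
    return a, b


def posterior_mean(a, b):
    return a / (a + b)


def _log_beta(x, y):
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def log_predictive(held_out, a, b):
    """Beta-Binomial log-probability of held-out real counts under Beta(a, b)."""
    k, n = held_out.successes, held_out.trials
    return (math.log(math.comb(n, k))
            + _log_beta(a + k, b + n - k)
            - _log_beta(a, b))


def calibrate_trust(real, synthetic, held_out, grid=(0.0, 0.05, 0.1, 0.25, 0.5)):
    """Pick the trust value on `grid` that best predicts held-out REAL data."""
    return max(
        grid,
        key=lambda t: log_predictive(held_out, *tempered_posterior(real, synthetic, t)),
    )


if __name__ == "__main__":
    real = Counts(successes=6, trials=20)           # empirical observations
    synthetic = Counts(successes=180, trials=200)   # model-generated "observations"
    for trust in (0.0, 0.05, 1.0):
        a, b = tempered_posterior(real, synthetic, trust)
        print(f"trust={trust:<4} posterior mean={posterior_mean(a, b):.3f}")
    print("calibrated trust:", calibrate_trust(real, synthetic, Counts(3, 10)))
```

In this sketch the synthetic sample is ten times larger than the real one, so pooling it at full weight would let it dominate the estimate; a small λ keeps it in the role of prior information. The calibration step scores each candidate λ only against held-out real observations, which is one simple way to anchor the weight empirically. The grid is capped below 1 here as a design choice reflecting the note's call for conservative trust.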

The related note "Does iterative prompt engineering undermine scientific validity?" raises a methodological critique, and the Foundation Priors framework provides the formal statistical apparatus for it. The self-fulfilling prophecy described there is a form of epistemic circularity: prompt iteration reinforces the user's priors without empirical anchoring.

Original note title

LLM outputs are draws from a subjective prior distribution not empirical observations — treating synthetic data as real evidence conflates structured belief with ground truth