SYNTHESIS NOTE

Do language models leak their training through fictional names?

When LLMs invent fictional experts, do they emit predictable name combinations that reveal their origin model and version? And can these patterns contaminate the scholarly record at scale?

Synthesis note · 2026-06-27 · sourced from Human Centered Design

When asked to invent fictional experts and given no names to use, models do not draw names independently from a pool of high-probability candidates. They emit correlated ensembles: Claude tends toward Elena Vasquez + Marcus Chen + Amara Okafor, Gemini toward Aris Thorne + Lena Petrova, and GPT toward Elara Voss with no fixed partner. These co-occurrence rates far exceed chance, are specific to model family and version, and shift at release boundaries. That vendors actively suppress these patterns at release is itself evidence that the priors were strong enough to be noticed. The consequence is a provenance signal that needs neither model access nor an intentional watermark: because vast amounts of web content are generated without overriding the defaults, the web becomes an unintentional, dateable archive of which model wrote what.
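One way to make "far exceed chance" concrete is lift: the observed rate at which two names appear in the same document, divided by the rate expected if they appeared independently. The sketch below is illustrative only and is not the method used in the source; the function name `pair_lift` and the four toy documents are hypothetical, and exact substring matching stands in for proper name recognition.

```python
from collections import Counter
from itertools import combinations


def pair_lift(docs, names):
    """Return {(a, b): lift}, where lift = P(a and b) / (P(a) * P(b)).

    A lift near 1.0 is what independent draws would give; values well
    above 1.0 mean the two names travel together.
    """
    n = len(docs)
    single = Counter()  # documents containing each name
    pair = Counter()    # documents containing both names of a pair
    for text in docs:
        present = sorted(name for name in names if name in text)
        single.update(present)
        pair.update(combinations(present, 2))
    lifts = {}
    for (a, b), count in pair.items():
        expected = (single[a] / n) * (single[b] / n)
        lifts[(a, b)] = (count / n) / expected
    return lifts


# Hypothetical toy corpus, invented for illustration (not data from the note).
toy_docs = [
    "Dr. Elena Vasquez and Marcus Chen co-led the study.",
    "Marcus Chen disagreed with Elena Vasquez on the method.",
    "Aris Thorne wrote the foreword.",
    "No invented experts appear here.",
]
toy_names = ["Elena Vasquez", "Marcus Chen", "Aris Thorne"]
lifts = pair_lift(toy_docs, toy_names)
print(lifts)
```

On a real corpus the same ratio would need a large sample and a significance test before it supported any claim; the toy corpus only shows the arithmetic.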

This reframes synthetic-content detection. Token-level watermarking assumes an intentionally embedded signal and stylometry a statistical surface; name priors are a behavioral leak that survives paraphrase and copying. It is a concrete, granular instance of "Do different AI models actually produce diverse outputs?": convergence here lands on a small set of named entities, the most fingerprint-friendly form. It also supplies a mechanism behind the trend in "How much of the internet is AI-generated now?": the same defaults repeating across millions of pages are what drive the decline in semantic diversity. And it operationalizes "Should we treat LLM outputs as real empirical data?": a name ensemble is literally the model's subjective prior made visible, then mistaken for a real person.
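Because names tend to pass through paraphrase and copying unchanged, a first-pass check can be as simple as looking for known ensemble members in a page. The sketch below assumes a lookup table built from the three ensembles named in this note; the names `ENSEMBLES` and `ensemble_hits` are hypothetical, the table carries no version information, and a hit is weak evidence at best, since real people share these names.

```python
# Ensembles as listed in this note; version-specific variants are not given there.
ENSEMBLES = {
    "Claude": {"Elena Vasquez", "Marcus Chen", "Amara Okafor"},
    "Gemini": {"Aris Thorne", "Lena Petrova"},
    "GPT": {"Elara Voss"},
}


def ensemble_hits(text, ensembles=ENSEMBLES):
    """Return {family: sorted names from that family's ensemble found in text}."""
    hits = {}
    for family, names in ensembles.items():
        found = sorted(name for name in names if name in text)
        if found:
            hits[family] = found
    return hits


# Hypothetical sample sentence, invented for illustration.
sample = "According to Dr. Elena Vasquez and analyst Marcus Chen, demand is rising."
print(ensemble_hits(sample))
```

Several names from one family appearing together are more informative than a single hit, which is why the function reports every match per family rather than a yes/no verdict.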

The downstream harm is not hypothetical: on Zenodo, 1,655 ghost-authored records with valid DataCite DOIs were registered in a 60-day automated burst, citing nonexistent journals with backdated dates. The limits the authors acknowledge (slop-site dates are unreliable, and the probe covers only public checkpoints) do not soften the structural claim: the infrastructure for large-scale contamination of the scholarly record already exists, and these ghost names are how it propagates.

Related concepts in this collection 3

Do different AI models actually produce diverse outputs?
Explores whether using multiple different language models together creates genuine diversity or whether shared training and alignment cause them to converge on similar answers despite independence.
exemplifies: convergence landing on named entities, the most fingerprint-revealing form

How much of the internet is AI-generated now?
What share of newly published websites contain AI-generated or AI-assisted content, and what measurable changes does this cause across semantic diversity, sentiment, accuracy, and style?
grounds: repeating defaults are a mechanism behind measured semantic-diversity decline

Should we treat LLM outputs as real empirical data?
Can synthetic text generated by language models serve as evidence in the same way observations from the world do? This matters because researchers increasingly rely on AI-generated content without accounting for its fundamentally different epistemic status.
exemplifies: a name ensemble is the subjective prior made visible then mistaken for a real person

Original note title

LLM name priors are a dateable behavioral fingerprint — correlated character ensembles leak from generations into the web and into the academic record at scale
