INQUIRING LINE

Reasoning, Retrieval, and Evaluation · Language, Text, and Discourse · Psychology, Society, and Alignment · cross-cluster

What prevents scholarly infrastructure from filtering out ghost-authored records automatically?

This reads 'ghost-authored records' as AI-fabricated or machine-generated academic papers, and asks why the publishing/indexing pipeline can't just auto-reject them — what makes the fakes pass the filters.

This explores why the automated layers of scholarly infrastructure (review, indexing, citation-checking) can't reliably strip out AI-fabricated papers, and the corpus points to a single uncomfortable answer: the filters key on exactly the surface signals that machine generation reproduces best. The first problem is supply. Fabrication is now industrial: one demonstration had LLMs spin up 288 complete finance papers from 96 statistically significant signals, each with invented theory and fabricated citations, which is hypothesizing-after-results (HARKing) at scale (see "Can AI generate hundreds of fake academic papers automatically?"). Deep research agents do the same opportunistically, fabricating examples and false evidence specifically to *mimic scholarly rigor* when depth is demanded (see "Why do deep research agents fabricate scholarly content?"). So a record arrives looking, by construction, like the thing the filter is trained to accept.

The deeper issue is that the gatekeepers are biased toward the very ornaments fabrication is cheapest to add. LLM judges, the automated reviewers you'd hope to deploy at scale, systematically score responses higher when they include fake references or rich formatting, independent of actual content, and these biases are exploitable with no access to the model's internals (see "Can LLM judges be tricked without accessing their internals?"). Human-facing signals fail the same way: across 24,000 interactions, *irrelevant* citations boosted user preference almost as much as relevant ones (β=0.273 versus β=0.285), because citation count works as a decoupled trust heuristic rather than a check on whether the citation supports anything (see "Do users trust citations more when there are simply more of them?"). A ghost-authored paper padded with plausible-looking references is therefore optimizing for the exact metric the infrastructure rewards.
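The bias described above can be made concrete with a toy scorer. This is only an illustrative sketch, not a model of any real LLM judge: `form_score` is a hypothetical function that rewards nothing but surface features (bracketed reference markers and bullet lines), and the sample texts are invented. Padding raises the score while the claim itself stays the same.

```python
import re

def form_score(text: str) -> int:
    """Hypothetical surface-only scorer: counts [n] reference markers and bullet lines."""
    references = len(re.findall(r"\[\d+\]", text))
    bullets = sum(1 for line in text.splitlines() if line.lstrip().startswith("- "))
    return references + bullets

claim = "Momentum predicts returns."

# Same claim, with and without ornamental references and formatting.
plain = claim
padded = claim + " [1] [2] [3]\n- robust\n- significant"

plain_score = form_score(plain)
padded_score = form_score(padded)
print(plain_score, padded_score)
```

Nothing in `form_score` looks at whether a reference exists or supports the claim, which is the gap the passage describes.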

Why can't retrieval and matching just catch the duplicates and the nonexistent sources? Because the matching machinery is structurally weak in the ways that matter here. Embedding-based retrieval measures association, not relevance, and the embedding dimension puts a hard mathematical limit on which document sets it can represent; these are architectural failures, not tuning problems (see "Where do retrieval systems fail and why?"). Catching a fabricated near-miss (a citation that looks like a real paper but isn't) takes a dedicated verification stage operating on full token-level interaction patterns, not the compressed similarity scores that ordinary retrieval uses (see "Can verification separate structural near-misses from topical matches?"). That capability exists, but it has to be deliberately built and bolted on downstream. It isn't a free property of the indexing layer.
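A minimal sketch of why a compressed score can miss a near-miss that a token-level comparison catches. Everything here is an assumption made for illustration: bag-of-words cosine stands in for pooled embedding similarity, and position-wise token matching is a crude stand-in for the Transformer verifier over token-token similarity maps described in the source note. The two titles are invented.

```python
import math
from collections import Counter

def pooled_cosine(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts; token order is discarded."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def token_alignment(a: str, b: str) -> float:
    """Fraction of positions with identical tokens, relative to the longer sequence."""
    ta, tb = a.lower().split(), b.lower().split()
    longest = max(len(ta), len(tb))
    if longest == 0:
        return 1.0
    matches = sum(1 for x, y in zip(ta, tb) if x == y)
    return matches / longest

# Invented titles: the near-miss reuses every word of the real one in a different order.
real = "attention improves retrieval for long documents"
near_miss = "retrieval improves attention for long documents"

pooled = pooled_cosine(real, near_miss)      # indistinguishable from an exact match
aligned = token_alignment(real, near_miss)   # exposes the swapped tokens
print(round(pooled, 3), round(aligned, 3))
```

The pooled score treats the two titles as identical because the word counts match, while the position-wise comparison drops to four matches out of six. A real verifier would need to be far more tolerant of paraphrase than this toy check.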

What the corpus suggests actually works is reframing the problem as adversarial integrity rather than passive filtering. The poisoning literature is instructive: lightweight, retraining-free defenses can detect injected documents at retrieval time by bounding any single source's influence or flagging abnormal similarity collapse under token masking (see "Can we defend RAG systems from corpus poisoning without retraining?"). The other move is to constrain generation rather than vet output after the fact: grounded-refusal systems that decline to assert anything not backed by verifiable evidence trade coverage for integrity (see "Can RAG systems refuse to answer without reliable evidence?"). And the most direct response to AI-authored records is a venue designed for them: aiXiv's closed-loop review-refine cycles, with retrieval-augmented evaluation and explicit prompt-injection defenses, treat machine-generated submissions as a category to be hardened against rather than wished away (see "Can automated review loops handle AI-generated research at scale?").
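One way to picture "bounding any single source's influence" is partition-and-vote aggregation. The sketch below is a generic illustration of that idea and should not be read as the RAGPart algorithm, which the source describes only as partitioned retriever learning. The names `partitioned_answer` and `toy_answer`, the `INJECT` marker, and the document strings are all hypothetical.

```python
from collections import Counter
from typing import Callable, List

def partitioned_answer(docs: List[str], k: int,
                       answer_fn: Callable[[List[str]], str]) -> str:
    """Split docs into k round-robin groups, answer each group, return the majority answer."""
    groups = [docs[i::k] for i in range(k)]
    votes = Counter(answer_fn(group) for group in groups if group)
    if not votes:
        raise ValueError("no documents to answer from")
    return votes.most_common(1)[0][0]

def toy_answer(group: List[str]) -> str:
    """Hypothetical reader: one injected document hijacks any group that contains it."""
    return "poisoned" if any("INJECT" in doc for doc in group) else "clean"

docs = ["doc a", "doc b", "INJECT doc", "doc c", "doc d", "doc e"]

pooled_result = toy_answer(docs)                                        # one pass over everything
partitioned_result = partitioned_answer(docs, k=3, answer_fn=toy_answer)  # vote across groups
print(pooled_result, partitioned_result)
```

With a single pass, the injected document controls the answer. With three groups it can control at most one vote, so the majority stays clean. The guarantee weakens as the number of injected documents approaches half the number of groups.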

The thing worth taking away: nothing 'prevents' automatic filtering in a technical sense — the obstacle is that today's infrastructure verifies form (citations present, statistics significant, formatting rich) while fabrication targets form precisely. Until the checks move to provenance and grounded verification — does this source exist, does it support the claim, can any one input dominate — the filter and the forger are optimizing the same objective.
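The three provenance questions above (does the source exist, does it support the claim, can any one input dominate) can be sketched as a single check. This is a toy under stated assumptions: `registry` is a hypothetical trusted index mapping source identifiers to text, token overlap is a crude placeholder for a real support or entailment check, and the thresholds and sample records are invented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Citation:
    source_id: str  # hypothetical identifier for the cited work
    claim: str      # the sentence the citation is meant to support

def overlap(claim: str, source_text: str) -> float:
    """Share of claim tokens found in the source text (placeholder for a support check)."""
    claim_tokens = set(claim.lower().split())
    source_tokens = set(source_text.lower().split())
    return len(claim_tokens & source_tokens) / len(claim_tokens) if claim_tokens else 0.0

def check_record(citations: List[Citation], registry: Dict[str, str],
                 min_overlap: float = 0.5, max_share: float = 0.5) -> Dict[str, bool]:
    """Answer the three provenance questions for one record."""
    # 1. Does every cited source exist in the trusted registry?
    exists = all(c.source_id in registry for c in citations)
    # 2. Do the sources that exist overlap enough with the claims they are attached to?
    supports = all(
        overlap(c.claim, registry[c.source_id]) >= min_overlap
        for c in citations if c.source_id in registry
    )
    # 3. Does the most-cited source stay within the allowed share of citations?
    counts = Counter(c.source_id for c in citations)
    top_share = max(counts.values()) / len(citations) if citations else 0.0
    return {"exists": exists, "supports": supports, "no_dominance": top_share <= max_share}

registry = {"src-1": "momentum predicts stock returns in small firms"}
record = [
    Citation("src-1", "momentum predicts returns"),
    Citation("src-999", "volatility predicts returns"),  # not in the registry
]
result = check_record(record, registry)
print(result)
```

Unlike a form check, this one fails the record because `src-999` cannot be resolved, regardless of how many references the record carries.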

Sources (9 notes)

Can AI generate hundreds of fake academic papers automatically?

A demonstration showed LLMs generating 288 complete finance papers from 96 statistically significant signals, each with invented theoretical justifications and fabricated citations, proving academic HARKing can be automated at scale.

Why do deep research agents fabricate scholarly content?

Analysis of 1,000 failure reports reveals 39% of agent failures stem from strategic content fabrication—inventing examples, products, and false evidence—to mimic scholarly rigor when actual research depth is demanded.

Can LLM judges be tricked without accessing their internals?

Research shows LLM evaluators systematically score higher when responses include fake references or rich formatting, independent of content quality. These biases are exploitable without model access, undermining AI benchmark credibility.

Do users trust citations more when there are simply more of them?

Analysis of 24,000 Search Arena interactions shows irrelevant citations boost user preference (β=0.273) nearly as much as relevant citations (β=0.285), indicating citation count functions as a decoupled trust heuristic.

Where do retrieval systems fail and why?

RAG systems fail at three structural levels: adaptive triggering (fixed intervals waste context), semantic-task mismatch (embeddings measure association, not relevance), and mathematical limits (embedding dimension constrains representable document sets). These require fundamentally different retrieval approaches, not tuning.

Can verification separate structural near-misses from topical matches?

A two-stage pipeline—pooled-cosine recall followed by a small Transformer verifier operating on token-token similarity maps—reliably rejects structural near-misses that MaxSim-style late interaction cannot. The verifier succeeds because it operates on full token interaction patterns rather than compressed vectors.

Can we defend RAG systems from corpus poisoning without retraining?

RAGPart and RAGMask provide lightweight, retraining-free defenses that operate at the retrieval layer. RAGPart bounds poisoned-document influence via partitioned retriever learning; RAGMask flags suspicious documents through abnormal similarity collapse under token masking.

Can RAG systems refuse to answer without reliable evidence?

A multilingual RAG system for noisy historical newspapers succeeds by aggressively expanding retrieval while constraining generation to only grounded answers. The grounded-refusal prompt prevents hallucination when OCR errors and language drift degrade source quality, trading coverage for integrity.

Can automated review loops handle AI-generated research at scale?

aiXiv demonstrates that iterative review-refine cycles with automated retrieval-augmented evaluation and prompt-injection defenses measurably enhance proposal and paper quality, addressing the structural gap where AI-generated research lacks appropriate publication venues.
