INQUIRING LINE

Catching incoherent text may require tracking who the story is about, not just the words it uses.

What role does entity salience play in detecting incoherence?

This explores how tracking the entities a text commits to — who or what it's about — helps catch when that text stops cohering, rather than detecting incoherence at the surface word level.

The corpus has no paper labeled "entity salience" outright, but it has a strong adjacent thread. The most direct hit is work showing that dialogue coherence fails in four distinct semantic modes, two of which are squarely about entities: contradiction and coreference inconsistency. Classifiers trained on Abstract Meaning Representation (AMR), which makes who-refers-to-whom explicit, detect these failures, while text-level manipulations alone miss them (What semantic failures break dialogue coherence most realistically?). That's the core answer: incoherence often hides in the entity graph, not the wording, so a detector that doesn't track entities is blind to a whole failure class.

There's a deeper reason entity tracking is hard for these models, and it's worth knowing. The 20-questions regeneration test shows that an LLM doesn't actually commit to a single character or object. It holds a superposition and samples one at generation time, so regenerating the same prompt yields different but locally consistent entities (Do large language models actually commit to a single character?). If the model never firmly fixes an entity, coreference drift isn't a bug at the edges; it's baked into how generation works. That reframes "detecting incoherence" as detecting when the sampled entity has quietly shifted underneath consistent-looking prose.
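The shape of that regeneration test can be sketched in a few lines of Python. This is a toy harness under stated assumptions, not the original experimental setup: `CANDIDATES`, `toy_reveal`, and `regeneration_test` are hypothetical names, and `toy_reveal` hard-codes the superposition behavior described above, so the output illustrates how to read the test rather than evidence about any real model. In practice the `reveal_fn` argument would wrap a call that regenerates a real model's answer.

```python
import random
from collections import Counter

# Toy world (hypothetical data): each object answers two yes/no questions.
CANDIDATES = {
    "cat": {"alive": True, "bigger_than_breadbox": False},
    "mouse": {"alive": True, "bigger_than_breadbox": False},
    "oak tree": {"alive": True, "bigger_than_breadbox": True},
    "toaster": {"alive": False, "bigger_than_breadbox": False},
}

def is_consistent(name, history):
    """True if the named object agrees with every (question, answer) pair so far."""
    return all(CANDIDATES[name][q] == a for q, a in history)

def toy_reveal(history, rng):
    """Stand-in for a model asked 'what were you thinking of?'.
    Assumption: it samples among all objects still consistent with the dialogue."""
    return rng.choice([n for n in CANDIDATES if is_consistent(n, history)])

def regeneration_test(history, reveal_fn, n_samples=50, seed=0):
    """Regenerate the reveal n_samples times and count which entities appear."""
    rng = random.Random(seed)
    return Counter(reveal_fn(history, rng) for _ in range(n_samples))

history = [("alive", True), ("bigger_than_breadbox", False)]
counts = regeneration_test(history, toy_reveal)

# More than one distinct key means no single entity was fixed in advance.
print(counts)
# Every revealed entity should still agree with the dialogue so far.
print(all(is_consistent(name, history) for name in counts))
```

A committed entity would show up as a single key in `counts`. Several keys that each pass `is_consistent` match the "different but locally consistent" pattern.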

The corpus also points to two contrasting ways to *find* that drift. One is meaning-level: semantic entropy clusters multiple sampled answers by whether they entail each other and flags divergence, a way of noticing when the "same" question produces semantically different commitments (Can we detect when language models confabulate?). The other is structural: a learned verifier operating on full token-to-token similarity maps reliably rejects "structural near-misses", things that look topically right but don't actually match, precisely because it reads the fine-grained interaction pattern rather than a compressed summary vector (Can verification separate structural near-misses from topical matches?). Both say the same thing from different angles: salient entities are where coherence is won or lost, and you need a representation that keeps them visible.
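The meaning-level approach can be sketched as follows. This is a minimal illustration, assuming cluster probabilities are estimated from raw sample counts and that an entailment judge is supplied by the caller. The names `cluster_by_entailment`, `semantic_entropy`, `CANON`, and `toy_entails` are hypothetical; `toy_entails` is a lookup table standing in for a real entailment model.

```python
import math

def cluster_by_entailment(answers, entails):
    """Group answers that entail each other in both directions."""
    clusters = []
    for answer in answers:
        for cluster in clusters:
            representative = cluster[0]
            if entails(representative, answer) and entails(answer, representative):
                cluster.append(answer)
                break
        else:
            # No existing cluster shares this meaning, so start a new one.
            clusters.append([answer])
    return clusters

def semantic_entropy(answers, entails):
    """Entropy over meaning clusters, with probabilities taken from sample counts."""
    clusters = cluster_by_entailment(answers, entails)
    n = len(answers)
    return sum((len(c) / n) * math.log(n / len(c)) for c in clusters)

# Toy entailment judge (assumption): two answers entail each other
# exactly when they name the same entity in this lookup table.
CANON = {
    "Paris": "paris",
    "It's Paris.": "paris",
    "The capital is Paris.": "paris",
    "Lyon": "lyon",
    "I think it's Lyon.": "lyon",
}

def toy_entails(a, b):
    return CANON[a] == CANON[b]

stable = ["Paris", "It's Paris.", "The capital is Paris."]
drifting = ["Paris", "Lyon", "It's Paris.", "I think it's Lyon."]

print(semantic_entropy(stable, toy_entails))    # one cluster: different words, same commitment
print(semantic_entropy(drifting, toy_entails))  # two clusters: the commitment itself diverges
```

The three `stable` answers differ at the token level but form one cluster, so the entropy is zero. The `drifting` answers split evenly across two entities, which is the divergence this method is meant to flag.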

Two more notes round out why this matters. Models routinely fail to integrate context when strong training-time associations override the entities actually present in the prompt, so incoherence is driven by the wrong entity being "loud" in parametric memory (Why do language models ignore information in their context?). They also fail badly at holding multiple valid interpretations of ambiguous text at once (Can language models recognize when text is deliberately ambiguous?). Taken together, "detecting incoherence" is less about catching contradictions in sentences and more about whether a system can keep track of which entities it has actually committed to, and the corpus suggests the surface text is the last place that breakdown shows up.

Sources 6 notes

What semantic failures break dialogue coherence most realistically?

Research using Abstract Meaning Representation identified four distinct incoherence types: contradiction, coreference inconsistency, irrelevancy, and decreased engagement. AMR-trained classifiers detect these semantic failures while text-level manipulations alone cannot.

Do large language models actually commit to a single character?

Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, which indicates that no fixed commitment exists.

Can we detect when language models confabulate?

Clustering sampled answers by bidirectional entailment and computing entropy over semantic clusters catches confabulations invisible at token level. This self-referential approach works across tasks without task-specific training data.

Can verification separate structural near-misses from topical matches?

A two-stage pipeline—pooled-cosine recall followed by a small Transformer verifier operating on token-token similarity maps—reliably rejects structural near-misses that MaxSim-style late interaction cannot. The verifier succeeds because it operates on full token interaction patterns rather than compressed vectors.
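A toy example shows why pooled vectors and MaxSim can both miss a structural near-miss that a full similarity map exposes. This is a sketch under loud assumptions: the one-hot embeddings in `EMB` are hypothetical and static (real systems use contextual embeddings), and `diagonal_mass` is a hand-written position check standing in for the learned Transformer verifier, which is not reproduced here.

```python
import math

# Hypothetical static one-hot token embeddings, for illustration only.
VOCAB = ["dog", "bites", "man"]
EMB = {w: [1.0 if i == j else 0.0 for j in range(len(VOCAB))]
       for i, w in enumerate(VOCAB)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def embed(text):
    return [EMB[token] for token in text.split()]

def mean_pool(vectors):
    """Compress a token sequence into one vector (discards order)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def similarity_map(query, doc):
    """Full token-to-token cosine similarity matrix."""
    return [[cosine(q, d) for d in doc] for q in query]

def maxsim(sim):
    """Late-interaction score: best match per query token, summed."""
    return sum(max(row) for row in sim)

def diagonal_mass(sim):
    """Stand-in verifier (assumption): share of similarity at aligned positions."""
    total = sum(sum(row) for row in sim)
    aligned = sum(sim[i][i] for i in range(min(len(sim), len(sim[0]))))
    return aligned / total

query = embed("dog bites man")
exact = embed("dog bites man")
near_miss = embed("man bites dog")  # same tokens, roles swapped

for name, doc in [("exact", exact), ("near_miss", near_miss)]:
    sim = similarity_map(query, doc)
    print(name,
          round(cosine(mean_pool(query), mean_pool(doc)), 6),
          maxsim(sim),
          round(diagonal_mass(sim), 6))
```

Pooled cosine and MaxSim are both order-invariant here, so they score the swapped sentence the same as the exact match. Only the layout of the similarity map changes, which is the kind of signal a verifier reading the full map can use.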

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Can language models recognize when text is deliberately ambiguous?

AMBIENT benchmark shows GPT-4 correctly disambiguates only 32% of cases versus 90% for humans. This failure spans lexical, structural, and scope ambiguity—revealing that LLMs cannot hold multiple interpretations simultaneously, a fundamental gap hidden by standard benchmarks.

Papers this line draws on (8)

The research behind the notes this line reads, ranked by how closely each paper relates.

Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds (1.70 match)
Chain-of-Verification Reduces Hallucination in Large Language Models (1.61 match)
Detecting hallucinations in large language models using semantic entropy (0.89 match)
We’re Afraid Language Models Aren’t Modeling Ambiguity (0.89 match)
DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations (0.87 match)
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments (0.86 match)
Aligning Language Models to Explicitly Handle Ambiguity (0.86 match)
How new data permeates LLM knowledge and how to dilute it (0.86 match)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a coherence analyst. The question remains: does entity salience — tracking WHO or WHAT a text commits to — reliably detect when that commitment breaks down, and can LLMs do it?

What a curated library found — and when (dated claims, not current truth):
Findings span 2022–2026. A curated library established:
• Dialogue incoherence hides in entity graphs (coreference drift, contradiction) rather than surface wording; Abstract Meaning Representation catches these, text-level methods miss them (2022).
• LLMs don't internally fix entities — they sample from a superposition at generation time, so regenerating the same prompt yields locally-consistent but globally-different entity commitments (2023).
• Semantic entropy (clustering by entailment) and learned token-similarity verifiers both reliably flag when "the same question" produces semantically divergent or structurally mismatched answers, catching entity drift (2024–2025).
• Models fail to integrate context when training-time entity associations override prompt-present entities, and fail badly at holding multiple valid interpretations of ambiguous text simultaneously (2023–2024).

Anchor papers (verify; mind their dates):
• arXiv:2203.09711 (2022): DEAM — AMR-based coherence eval.
• arXiv:2304.14399 (2023): Ambiguity recognition failure.
• arXiv:2401.06855 (2024): Fine-grained hallucination detection.
• arXiv:2603.29025 (2026): Surface heuristics override implicit constraints.

Your task:
(1) RE-TEST EACH CONSTRAINT. For the superposition claim: do newer training regimes (RL from human feedback, constitutional AI, consistency training ~2025) now *lock in* entity choice earlier in generation, collapsing the superposition? For semantic-level detection: do retrieval-augmented or latent-reasoning orchestrations (CLaRa, 2026) reduce entity drift by grounding to external context? For the "loud parametric memory" failure: do improved prompt-sensitivity methods (ProSA, 2024–2025) or positional bias mitigations (demo placement, 2025) now override that override? Separate the durable question (does entity coherence require explicit tracking?) from perishable limits (which training or inference methods have relaxed the drift problem).
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Do papers on compositional sensitivity (2026) or consistency training (2025) claim incoherence detection WITHOUT explicit entity graphs? Do any papers show that surface-level or dense-retrieval methods now catch entity drift?
(3) Propose 2 research questions that ASSUME the regime may have moved: (a) If consistency training or compositional RL now locks entity choice, does explicit AMR representation become redundant for detection, or does it remain necessary for *explanation*? (b) If multi-agent or memory-augmented inference resolves parametric-memory override, can we measure the trade-off between grounding cost and detection precision?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
