SYNTHESIS NOTE
Language, Text, and Discourse

Where does LLM metaphor comprehension actually break down?

Literary metaphors range from conventional (dead metaphors) to novel conceptual mappings. This research asks whether LLMs fail predictably as metaphors become more abstract and creative, and what that tells us about their semantic reasoning limits.

Synthesis note · 2026-03-26
Where exactly do language models fail at structural language tasks?

These directions emerge from the convergence of findings across the vault. Each one is grounded in existing research and proposes a testable investigation.

1. The Metaphor Comprehension Spectrum. Where on the spectrum from dead metaphor ("table leg") to novel literary metaphor ("Memory, a jar of flies") does LLM comprehension break down? Conventional metaphors are lexicalized; novel metaphors require conceptual mapping between dissimilar domains. The metaphor extraction paper ("Automatic Extraction of Metaphoric Analogies from Literary Texts") provides a dataset and a methodology; the pragmatic competence gap predicts the failure point.

2. The Rhetoric Analysis Paradox. Can LLMs identify rhetorical devices (anaphora, chiasmus, antithesis, litotes) in existing texts even though they cannot deploy them evaluatively? This tests whether recognition and production are dissociated for rhetoric, as the note "Can LLMs generate more novel ideas than human experts?" suggests. If LLMs can label a chiasmus but cannot explain why it is effective in context, that reveals the boundary between mechanical and meaningful analysis.

3. The Implicit Meaning Wall. Is there a fundamental ceiling on LLM literary analysis imposed by the implicit meaning deficit, and can chain-of-thought (CoT) prompting breach it? Three findings converge: 24% on implicit discourse relations, 32% on ambiguity recognition, and systematic failure on presuppositions. In light of the note "Can language models actually analyze language structure?", CoT may enable explicit decomposition of implicit structure. If it does not, LLM literary analysis has a hard boundary.

4. Style as Surface vs. Style as Substance. Can LLMs distinguish between stylistic features that carry semantic weight and those that are merely conventional? Authorship attribution at 95% shows that style detection works at the pattern level. The question is whether LLMs can interpret why a style choice matters, that is, whether they can move from pattern recognition to semantic interpretation of formal features.

5. The Evaluative Stance Problem for Literary Criticism. Can LLMs be prompted or fine-tuned to produce genuine literary criticism, or does the absence of evaluative stance-taking make literary judgment structurally inaccessible? In light of the note "Can models learn argument quality from labeled examples alone?", LLMs might produce literary criticism only when provided with explicit critical frameworks (New Criticism, reader-response theory) as scaffolding.

6. Cross-Text Analogical Reasoning. Can LLMs identify structural analogies between texts, for example recognizing that Kafka's Metamorphosis and Ovid's Metamorphoses share transformation-as-identity-crisis, or that Moby-Dick and The Old Man and the Sea explore obsession-futility through opposed scales? In light of the note "Do large language models reason symbolically or semantically?", failure is the expected outcome for cross-text analogy, which is conceptual rather than lexical. But metalinguistic capabilities and compositional generalization at scale might help.

7. The Compression-Nuance Trade-off in Literary Language. Does LLM semantic compression systematically destroy the features that make literary language literary? Testable by having LLMs paraphrase poetry and measuring which dimensions of meaning survive versus collapse. If compression preserves denotation but destroys connotation, that quantifies the gap between understanding what a text says and what a text means.
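
The measurement proposed in direction 7 could be sketched roughly as follows. This is a minimal illustration, not an established protocol: it assumes annotators have already scored each LLM paraphrase on how well it preserves each dimension of meaning, on a 0 to 1 scale. The dimension names, the rating scale, and the function names `retention_by_dimension` and `compression_gap` are all hypothetical.

```python
from statistics import mean

# Hypothetical dimensions: the note names only denotation and connotation.
DIMENSIONS = ("denotation", "connotation")


def retention_by_dimension(ratings):
    """Average how well each dimension survives paraphrase.

    `ratings` is a list with one dict per paraphrased passage, mapping
    each dimension to an assumed annotator score in [0, 1], where 1.0
    means fully preserved and 0.0 means lost.
    """
    return {dim: mean(item[dim] for item in ratings) for dim in DIMENSIONS}


def compression_gap(retention):
    """Denotation retention minus connotation retention.

    A large positive value would be consistent with compression that
    keeps what a text says while losing what it means.
    """
    return retention["denotation"] - retention["connotation"]


# Illustrative, made-up ratings for two paraphrased passages.
example = [
    {"denotation": 1.0, "connotation": 0.5},
    {"denotation": 0.5, "connotation": 0.0},
]
retention = retention_by_dimension(example)
gap = compression_gap(retention)
```

A single gap score is only a starting point. A fuller study would need more dimensions (sound, ambiguity, register) and a check on inter-annotator agreement before reading anything into the number.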

Original note title

seven research directions for LLM literary analysis — from metaphor comprehension spectra to compression-nuance trade-offs