Can large language models translate natural language to logic faithfully?

This explores whether LLMs can convert natural language statements into formal logical representations without losing meaning. It matters because faithful translation is essential for any AI system that reasons formally or verifies specifications.

Synthesis note · 2026-02-21 · sourced from Natural Language Inference

"Faithful Autoformalisation with LLMs" evaluates whether LLMs can translate natural language statements into formal logical representations (first-order logic, modal logic, higher-order logic). The central finding: LLMs can produce syntactically well-formed logical expressions but fail to produce semantically faithful ones. Form and content come apart.

The failure modes are not random — they cluster at precisely the points where natural language semantics are most structurally complex:

Scope ambiguity resolution: "Every student passed some exam" requires committing to a reading (∀∃ or ∃∀). LLMs tend to produce the surface-order reading rather than the contextually correct one, or produce an unambiguous formula that does not capture the original ambiguity.
Quantifier precision: Natural language quantifiers ("most", "few", "many") have no standard first-order equivalents. LLMs substitute familiar quantifiers (∀, ∃) that alter the truth conditions.
Predicate granularity: Natural language predicates compress relational structure. "John gave Mary the book" has multiple relational decompositions in FOL, and LLMs often choose flat single-predicate representations that lose the event structure.

This is a structurally significant finding because autoformalisation is a direct test of the relationship between linguistic competence and logical competence. If LLMs can handle syntax-level structure (as they demonstrably can for many tasks) but fail when that structure requires semantic commitment, then their linguistic processing operates at a level that stops short of truth-conditional content.

The finding connects to the broader pattern that Can models pass tests while missing the actual grammar?. For autoformalisation, "correct output" would mean syntactically valid logic; "genuine linguistic generalisation" would require semantic faithfulness. LLMs achieve the former but not the latter.

This also extends Do language models actually use their encoded knowledge? — even when LLMs encode semantic properties, that encoding does not reliably translate into semantic commitment during generation. The representation has the information; the generation does not use it.

Structured semantics understanding vs. generation asymmetry: "Probing Structured Semantics Understanding and Generation" (2401.05777) confirms the generation < understanding asymmetry for formal languages. LLMs can interpret formal language (translate logical form to natural language question) more accurately than they can generate it (translate natural language to logical form). Furthermore, formal language complexity matters: lower formalization (closer to natural language, like KoPL) is easier for models; higher formalization (SPARQL, Lambda DCS) fails at entity grounding. This suggests the autoformalisation failure is graded — the farther the target representation from natural language surface form, the worse the semantic fidelity.

The practical implication: LLM-assisted formal verification, specification writing, or logical reasoning pipelines cannot trust LLM-generated logical forms without post-hoc semantic verification. The syntactic plausibility of the output masks the semantic errors.

Inquiring lines that read this note 19

This note is a source for these research framings, grouped by the broader line of inquiry each explores. Scan the bold lines of inquiry; follow any specific question forward.

What structural advantages do diffusion language models offer over autoregressive methods?

Can autoregressive models learn faithful translation to logical representations without semantic loss?

Do language models learn genuine linguistic structure or just surface patterns?

Do language models perform faithful symbolic reasoning independent of semantic grounding?

Do accurate-looking LLM outputs hide structural failures in learning and reasoning?

Which use cases can tolerate unverified LLM outputs without external verification?

How faithfully do LLMs reflect their actual reasoning in outputs and explanations?

How effectively do deterministic tools improve language model reasoning on formal tasks?

Why does moving verifier synthesis to the LLM extend verification beyond math and code domains?

Do language models understand semantics or rely on pattern matching?

What semantic information is necessary to preserve for sound LLM reasoning?

Why does verification consistently lag behind AI generation?

What makes line-by-line proof checking a good fit for AI verification?

Related concepts in this collection 6

This note in its neighbourhood — explore the map, then jump to a related concept in the list below.

Concept map

23 direct connections · 165 in 2-hop network ·medium cluster Open in graph ↗

Can large language models translate natural lang… Can models pass tests while missing the actual gra… Do LLMs predict entailment based on what they memo… Can language models learn meaning from text patter… Why do embedding contexts confuse LLM entailment p… Should LLMs handle abstraction only in optimizatio… Can we automatically generate formal verifiers fro…

Click a node to walk · click center to open · click Open in graph to see this note in the full knowledge graph

your link semantically near linked from elsewhere

Can models pass tests while missing the actual grammar? Do language models succeed on grammatical benchmarks by learning surface patterns rather than structural rules? This matters because correct outputs may hide reliance on shallow heuristics that fail on novel structures.
same gap: syntactic correctness without semantic accuracy
Do LLMs predict entailment based on what they memorized? Explores whether language models make entailment decisions by recognizing memorized facts about the hypothesis rather than reasoning through the logical relationship between premise and hypothesis.
both failures trace to LLMs not committing to truth-conditional content
Can language models learn meaning from text patterns alone? Explores whether training on form alone—predicting the next word from prior words—could ever give language models access to communicative intent and genuine semantic understanding.
autoformalisation failure is a concrete manifestation of this structural claim
Why do embedding contexts confuse LLM entailment predictions? Can language models distinguish between contexts that preserve versus cancel entailments? The study explores whether LLMs systematically fail to apply the semantic rules governing presupposition triggers and non-factive verbs.
related: logical connectives and quantifiers show the same opacity as presupposition triggers
Should LLMs handle abstraction only in optimization? What if LLMs worked exclusively on translating problems to formal constraints, while deterministic solvers handled the numeric work? Explores whether this division of labor could overcome LLM failures in iterative computation.
both ally and tension: the abstraction-only paradigm explicitly relies on LLMs doing well at exactly the NL→formal translation step this note finds them failing at. The two notes together imply that the paradigm needs verification machinery for the formal output, not just trust that the translation is faithful — and that the paradigm's effectiveness ceiling is bounded by the autoformalisation failure rate.
Can we automatically generate formal verifiers from policy text? Verifier scarcity blocks process verification in most domains. Can language models synthesize correct-by-construction formal checkers directly from natural-language policies, bridging informal rules and rigorous proof?
grounds the weakest-link warning: auto-synthesized verifiers inherit the prose-to-formal translation failure this note documents

Related papers in this collection 8

Papers most semantically related to this note, ranked by cosine similarity in the embedding space.

Original note title

llms fail at faithful autoformalisation because they cannot translate natural language to logical representations without semantic loss

Can large language models translate natural language to logic faithfully?

Inquiring lines that read this note 19

Related concepts in this collection 6

Related papers in this collection 8

Search by related questions 4