INQUIRING LINE


If you train AI on made-up rule-following languages, does it learn real grammar — or just get better at faking it?

Can formal language pretraining address surface generalization without learning true linguistic structure?

This explores whether pretraining on artificial 'formal' languages (structured symbol systems) actually teaches a model grammar — or whether it just produces a better mimic that passes tests by exploiting surface cues rather than internalizing real linguistic rules.

The corpus stages this question as a real tension rather than settling it. On the optimistic side, pre-pretraining 1B-parameter models on hierarchical formal languages does more than save tokens: it improves *syntactic* generalization, and the attention heads formed on those formal patterns remain critical when the model later handles natural language (see "Can formal language pretraining make language models more efficient?"). That persistence is the strongest hint that something structural, not cosmetic, is being learned: the formal scaffolding survives the transfer.
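
To make "hierarchical formal language" concrete, here is a minimal sketch of a depth-bounded bracket (Dyck-style) generator. The choice of a Dyck-style language is an assumption for illustration, since the notes do not say which formal languages were used. The names `sample_dyck`, `nesting_depth`, and `is_balanced` are hypothetical helpers, not taken from any cited work.

```python
import random

OPENERS = "([{<"
CLOSERS = ")]}>"
PAIRS = dict(zip(CLOSERS, OPENERS))  # closer -> matching opener


def sample_dyck(num_pairs, max_depth, num_types=2, rng=None):
    """Sample a well-nested bracket string with `num_pairs` pairs and
    nesting depth at most `max_depth` (illustrative, not from the source)."""
    if max_depth < 1 or not 1 <= num_types <= len(OPENERS):
        raise ValueError("need max_depth >= 1 and 1 <= num_types <= 4")
    rng = rng or random.Random()
    out, stack, remaining = [], [], num_pairs
    while remaining > 0 or stack:
        can_open = remaining > 0 and len(stack) < max_depth
        can_close = bool(stack)
        # Open when closing is impossible, otherwise flip a fair coin.
        if can_open and (not can_close or rng.random() < 0.5):
            t = rng.randrange(num_types)
            stack.append(t)
            out.append(OPENERS[t])
            remaining -= 1
        else:
            out.append(CLOSERS[stack.pop()])
    return "".join(out)


def nesting_depth(s):
    """Maximum nesting depth reached anywhere in the string."""
    depth = best = 0
    for ch in s:
        depth += 1 if ch in OPENERS else -1
        best = max(best, depth)
    return best


def is_balanced(s):
    """True if every closer matches the most recent unmatched opener."""
    stack = []
    for ch in s:
        if ch in OPENERS:
            stack.append(ch)
        elif not stack or stack.pop() != PAIRS.get(ch):
            return False
    return not stack
```

Calling `sample_dyck(8, 3, rng=random.Random(0))` yields a 16-symbol string whose brackets nest at most three deep. Capping `max_depth` is one simple way to control how much hierarchy such a synthetic corpus contains.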

But the skeptical thread cuts hard against reading that as 'true structure.' BabyLM-style evaluations show that models routinely produce grammatically correct outputs by leaning on sentence length, word choice, and spelling, surface heuristics that mimic rules without being rules. They also show that standard benchmarks cannot tell the two apart unless they are designed to rule out the shortcuts (see "Can models pass tests while missing the actual grammar?"). So the very 'syntactic generalization' that formal pretraining improves may itself be measured by tests that surface heuristics can pass. The improvement is real; what it *is* remains contested.
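
One way to check whether a benchmark leaks a shortcut is to score it with the shortcut alone. The sketch below assumes a minimal-pair format (a grammatical sentence paired with an ungrammatical one) and a "shorter is better" heuristic. Both the format and the toy sentences are illustrative assumptions, not data from the BabyLM evaluations.

```python
def heuristic_accuracy(pairs, score):
    """Fraction of (good, bad) pairs where score(good) > score(bad).
    Ties count as half, i.e. a coin flip."""
    total = 0.0
    for good, bad in pairs:
        g, b = score(good), score(bad)
        if g > b:
            total += 1.0
        elif g == b:
            total += 0.5
    return total / len(pairs)


def shorter_is_better(sentence):
    """Surface heuristic: prefer the sentence with fewer words."""
    return -len(sentence.split())


# Hypothetical toy pairs. In the first set the bad sentence is always longer,
# so length alone "solves" it; in the second set lengths are matched.
confounded = [
    ("the cat sleeps", "the cat sleeps sleeps"),
    ("dogs bark", "dogs bark the"),
]
length_matched = [
    ("the cats sleep", "the cats sleeps"),
    ("the dog barks", "the dog bark"),
]

print(heuristic_accuracy(confounded, shorter_is_better))      # 1.0
print(heuristic_accuracy(length_matched, shorter_is_better))  # 0.5
```

If a grammar-blind heuristic scores well above 0.5 on a test set, a model's high score on that set says little about whether it learned the rule.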

The place to look for the seam is structural complexity. Top-tier models systematically misidentify embedded clauses, complex verb phrases, and deep nominals, and crucially the failure worsens *predictably* as syntactic depth increases (see "Why do large language models fail at complex linguistic tasks?"). That predictable degradation is a signature: genuine rule knowledge would not fray with depth the way a pattern-matcher's performance does. If formal pretraining taught real recursive structure, you would expect that curve to flatten. Testing it there, rather than on aggregate scores, is where the question actually gets answered.
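
The depth test described above can be sketched as a per-depth accuracy table plus a fitted slope. This assumes each evaluation item is annotated with its syntactic depth. The record format and the function names `accuracy_by_depth` and `depth_slope` are hypothetical, and the least-squares slope is only one of several reasonable summaries.

```python
from collections import defaultdict


def accuracy_by_depth(records):
    """records: iterable of (depth, correct) pairs -> {depth: accuracy}."""
    hits, counts = defaultdict(int), defaultdict(int)
    for depth, correct in records:
        counts[depth] += 1
        hits[depth] += int(bool(correct))
    return {d: hits[d] / counts[d] for d in sorted(counts)}


def depth_slope(acc):
    """Ordinary least-squares slope of accuracy against depth.
    Near zero means a flat curve; strongly negative means fraying with depth."""
    xs = list(acc)
    ys = [acc[x] for x in xs]
    if len(xs) < 2:
        raise ValueError("need at least two distinct depths")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical results: (syntactic depth, was the model's answer correct?)
records = [(1, True), (1, True), (2, True), (2, False), (3, False), (3, False)]
acc = accuracy_by_depth(records)
print(acc)               # {1: 1.0, 2: 0.5, 3: 0.0}
print(depth_slope(acc))  # -0.5
```

Comparing this slope for a baseline model and a formally pre-pretrained one, on the same depth-annotated items, is the comparison the paragraph argues for. An aggregate score would hide it.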

Step back and there is a deeper ceiling the corpus keeps circling. Even perfect formal structure is structure *over form*, and the form-only argument holds that meaning needs the relation between expressions and communicative intent, which form-to-form prediction can never supply (see "Can language models learn meaning from text patterns alone?"). The counterpoint reframes rather than refutes this: models operationalize Saussure's *langue*, learning a fully relational system where structure emerges from how symbols differentiate each other, no external referent required (see "Can language models learn meaning without engaging the world?"). Read together, these suggest formal pretraining might genuinely teach *relational* structure, the internal differential system, while telling us nothing about whether the model grasps what language is *for*.

So the honest answer is layered. Formal pretraining demonstrably does more than dress up surface generalization: its learned heads transfer and persist. But 'addressing surface generalization' and 'learning true linguistic structure' are not a clean binary. The corpus suggests models can acquire real *relational* structure that still degrades with depth and still lacks grounded meaning. The less obvious takeaway is that the bottleneck may not be the training signal at all. It may be that our benchmarks cannot distinguish the two outcomes, so we have been unable to tell which one formal pretraining actually buys.

Sources 5 notes

Can formal language pretraining make language models more efficient?

Pre-pretraining 1B models on hierarchical formal languages achieves equivalent loss and better syntactic generalization using 33% fewer natural language tokens. The mechanism persists: attention heads trained on formal languages remain critical for syntactic performance on natural language.

Can models pass tests while missing the actual grammar?

BabyLM evaluations showed models can produce correct outputs by relying on sentence length, word choice, and orthography rather than grammatical structure. Standard benchmarks cannot distinguish these two generalization types without tests specifically designed to rule out surface heuristics.

Why do large language models fail at complex linguistic tasks?

Top-tier LLMs like Llama3-70b consistently misidentify embedded clauses, verb phrases, and complex nominals. Performance degrades predictably as syntactic depth increases, revealing that statistical learning captures surface patterns but not deep grammatical rules.

Can language models learn meaning from text patterns alone?

Bender & Koller argue that meaning requires the relation between expressions and communicative intents. Since LLMs are trained only on form-to-form prediction with no access to shared attention or intent, they cannot reconstruct the meaning that grounds language.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

Papers this line draws on 8

The research behind the notes this line reads — ranked by how closely each paper relates.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a mechanistic interpretability analyst. The question: Does formal-language pretraining install genuine recursive/relational linguistic structure, or merely surface heuristics that *pass* grammar tests?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2025. The library reports:
- Formal-language pretraining on 1B models improves syntactic generalization; attention heads learned on formal patterns persist under transfer to natural language, suggesting a structural rather than cosmetic gain (~2025, arXiv:2502.19249).
- Standard benchmarks cannot distinguish grammatically correct outputs produced by surface heuristics (sentence length, word choice, spelling) from outputs grounded in actual rules; the test itself is blind to the mechanism (~2024–2025, implicit in multiple works).
- LLMs show systematic, *depth-dependent* syntactic failures: misidentification of embedded clauses, complex verb phrases, and deep nominals degrades *predictably* as structure depth increases — a signature of pattern-matching rather than recursive rule application (~2025, arXiv:2503.19260).
- Models may operationalize a fully relational (Saussurean *langue*) system where meaning emerges from differential symbol relations, not grounded reference — this is "real structure" but still form-only (~2025, implied in arXiv:2508.12863 and mechanistic work).
- Transfer learning and post-completion mechanisms can refine or dilute learned structure, but the underlying question of what structure *is* learned remains unresolved (~2025, arXiv:2504.09522, arXiv:2507.20252).

Anchor papers (verify; mind their dates):
- arXiv:2502.19249 (Feb 2025): Pre-pretraining on formal languages imparts linguistic biases.
- arXiv:2503.19260 (Mar 2025): Linguistic blind spots worsen with depth.
- arXiv:2508.12863 (Aug 2025): Word meanings in transformers.
- arXiv:2507.08017 (Jul 2025): Mechanistic indicators of understanding.

Your task:
(1) RE-TEST EACH CONSTRAINT. For formal pretraining's claimed transfer benefit, check: have larger models (>7B), instruction-tuning, or new architectures (SSMs, hybrid) since relaxed or inverted the depth-degradation signature? Does the persistence of formal-scaffold heads still hold under scaling? Explicitly separate the durable question ("Does formal training install *relational* structure?") from the perishable claim ("Benchmarks cannot tell the difference") — has evaluation methodology matured to disambiguate?

(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Look for: (a) papers showing surface heuristics alone achieve the reported transfer gains; (b) mechanistic work proving formal structure *is* learnable and detectable; (c) reframes of "true structure" that sidestep the form/meaning divide.

(3) Propose 2 research questions that ASSUME the regime may have moved:
   - If formal pretraining *does* install relational structure, does that structure remain inert without grounding, or does it enable new forms of zero-shot compositional reasoning in downstream tasks?
   - Can adversarially-designed benchmarks that explicitly forbid surface shortcuts (e.g., random word order, depth-controlled syntax trees) now separate genuine grammar from pattern-matching?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
