Does next-token prediction alone produce genuine functional language competence?
This explores whether training a model purely to predict the next token can yield real language understanding — or only the fluent appearance of it.
The corpus splits along a clean fault line that's worth seeing directly. The sharpest "no" comes from a distinction borrowed from neuroscience: there's a difference between *formal* competence (knowing what's grammatical and well-formed) and *functional* competence (using language to reason, plan, and act in the world). One line of work argues these run on neurologically distinct machinery, and that next-token prediction trains the formal circuit beautifully while never activating the broader networks that functional understanding requires (see "Are language models developing real functional competence or just formal competence?"). A parallel philosophical argument says the same thing about meaning: if you only ever see form predicting form, with no access to the shared attention and communicative intent that ground language for humans, you can't reconstruct what the words actually *mean* (see "Can language models learn meaning from text patterns alone?").
But the corpus contains a genuine rebuttal, not just agreement. Another note leans on Saussure's idea of *langue*, language as a self-contained web of relations among signs, to argue that fluent, culturally situated language emerges from compressing relational structure alone, with no external referents and no embodied grounding required (see "Can language models learn meaning without engaging the world?"). Read together, the two camps may not even disagree on the facts so much as on the word "competence": form-only training clearly captures the relational system of language, and the open question is whether that system *is* the competence or merely its skeleton.
Where it gets interesting is the mechanistic middle ground. If you treat a model as fundamentally an autoregressive probability machine, you can predict where it breaks: tasks with low-probability target answers (counting letters, reversing the alphabet) stay hard even when they're logically trivial, because the objective rewards likelihood, not correctness (see "Can we predict where language models will fail?"). That's the signature of formal-without-functional competence showing through. Yet the surface output can also hide what the network computes: logit-lens probing of models trained with hidden chain-of-thought tokens shows them computing correct answers in early layers and then *overwriting* them to produce format-compliant filler (see "Do transformers hide reasoning before producing filler tokens?"), suggesting the surface token stream undersells what the network actually does.
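To make the "probability machine" framing concrete, here is a minimal sketch, assuming a toy bigram table in place of a real language model. The table `TOY_BIGRAM`, its probabilities, and the helper `sequence_log_prob` are invented for illustration and do not come from any of the notes. It shows only the general point: an autoregressive scorer sums per-token log-probabilities, so a logically trivial but rarely seen sequence (the alphabet reversed) scores far below the familiar one.

```python
import math

# Hypothetical next-token probabilities, invented for illustration only.
# Keys are (previous_token, next_token). Forward alphabet order is common
# in text, so it gets high probability; reverse order is rare.
TOY_BIGRAM = {
    ("a", "b"): 0.90, ("b", "c"): 0.90, ("c", "d"): 0.90,
    ("d", "c"): 0.02, ("c", "b"): 0.02, ("b", "a"): 0.02,
}

def sequence_log_prob(tokens, table, floor=1e-6):
    """Sum log p(x_t | x_{t-1}) over the sequence (chain rule, bigram case).

    Transitions missing from the table fall back to a small floor
    probability so the logarithm stays defined.
    """
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        total += math.log(table.get((prev, nxt), floor))
    return total

forward = sequence_log_prob(["a", "b", "c", "d"], TOY_BIGRAM)
backward = sequence_log_prob(["d", "c", "b", "a"], TOY_BIGRAM)

# Both tasks are equally easy to specify, but the likelihood objective
# strongly prefers the sequence it has seen more often.
print(f"forward  log-prob: {forward:.3f}")
print(f"backward log-prob: {backward:.3f}")
```

A real model conditions on the whole prefix, not just the previous token, but the asymmetry works the same way: the objective rewards likelihood, so the rare target stays hard.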
The most provocative thread argues the limit isn't in next-token prediction itself but in how we've been using it. Reframe the objective so each predicted token earns a *verifiable* reward from the corpus, and next-token prediction becomes a reasoning task rather than a mimicry task (see "Can next-token prediction become a reasoning task with RL?"). A related finding shows the real learning signal lives in a small minority (roughly 20%) of high-entropy "forking" tokens, the decision points, not in the bulk of predictable filler (see "Do high-entropy tokens drive reasoning model improvements?"). Together these hint that the gap between formal and functional competence may be less about the prediction objective and more about which predictions you train hardest on.
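The two ideas in this paragraph can be sketched in a few lines, under stated assumptions. The functions `entropy`, `forking_positions`, and `verifiable_reward` are hypothetical names, the distributions are made up, and the exact-match reward is a simplification chosen for illustration, not the reward used in the cited work. The 20% default echoes the share of high-entropy tokens reported in the source notes.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def forking_positions(distributions, fraction=0.2):
    """Return the indices of the highest-entropy positions.

    `fraction` is the share of positions to keep; at least one position
    is always returned.
    """
    k = max(1, round(fraction * len(distributions)))
    ranked = sorted(
        range(len(distributions)),
        key=lambda i: entropy(distributions[i]),
        reverse=True,
    )
    return sorted(ranked[:k])

def verifiable_reward(predicted_token, corpus_token):
    """Assumed reward: 1.0 if the prediction matches the corpus token."""
    return 1.0 if predicted_token == corpus_token else 0.0

# Five hypothetical next-token distributions over a 3-token vocabulary.
# Only position 2 is a genuine decision point; the rest are near-certain.
dists = [
    [0.98, 0.01, 0.01],
    [0.97, 0.02, 0.01],
    [0.40, 0.35, 0.25],
    [0.99, 0.005, 0.005],
    [0.95, 0.03, 0.02],
]

forks = forking_positions(dists)
print("forking positions:", forks)
print("reward for a correct guess:", verifiable_reward("cat", "cat"))
```

In this framing, a training loop would compute or weight the reward mainly at the positions in `forks`, which is one way to read "which predictions you train hardest on."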
So the honest synthesis: the corpus says next-token prediction *reliably* produces formal competence and fluent relational language, but the evidence that it produces functional competence on its own is contested and, on the neuroscience-grounded reading, negative. What closes the gap in this collection is never plain prediction. It is verifiable reward, internal reasoning that the surface tokens conceal, or self-generated feedback signals layered on top. The thing you didn't know you wanted to know: the disagreement is partly definitional, and the most rigorous mechanistic critiques and the most optimistic results may be pointing at the same small set of high-stakes tokens where reasoning actually happens.
Sources (7 notes)
Neuroscience evidence shows next-token prediction produces formal linguistic competence but not functional competence, because functional understanding requires integration of diverse brain networks beyond language circuits that the prediction objective never activates.
Bender & Koller argue that meaning requires the relation between expressions and communicative intents. Since LLMs are trained only on form-to-form prediction with no access to shared attention or intent, they cannot reconstruct the meaning that grounds language.
Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.
By framing LLMs as autoregressive probability machines, researchers predicted that tasks with low-probability target responses would be systematically harder, even when logically simple. Experiments confirmed these predictions on tasks such as reciting the alphabet backwards and counting letters.
Logit lens analysis shows models trained with hidden CoT tokens compute correct answers in layers 1-3, then actively suppress these representations in final layers to produce format-compliant filler output. The reasoning is fully recoverable from lower-ranked token predictions.
Reinforcement Pre-Training transforms next-token prediction into a reasoning task by providing verifiable rewards from the corpus itself, eliminating reward hacking and enabling inference-time scaling during pretraining. This suggests token-level reasoning patterns during pretraining strengthen downstream RL fine-tuning.
Only ~20% of tokens exhibit high entropy as pivotal reasoning decision points; RLVR primarily adjusts these forking tokens. Training exclusively on them matches or exceeds full-gradient performance, revealing that the minority carries the learning signal.