Are language models developing real functional competence or just formal competence?
Neuroscience suggests formal linguistic competence (rules and patterns) and functional competence (real-world understanding) rely on different brain mechanisms. Can next-token prediction alone produce both, or does it leave functional competence behind?
Fedorenko and colleagues ("Dissociating language and thought in large language models") ground the LLM competence debate in neuroscience. Formal linguistic competence (knowledge of linguistic rules and patterns, including grammatical structure and syntactic regularities) relies on dedicated language circuits in the brain. Functional linguistic competence (understanding and using language in the world) requires integrating diverse brain networks beyond the language circuits: memory, reasoning, social cognition, and sensorimotor systems.
The critical finding: word-in-context prediction, the training objective of most LLMs, produces formal competence as an emergent outcome. On this account it does not, and on its own cannot, produce functional competence, because functional competence requires integrating systems that are architecturally distinct in the brain and that the prediction objective does not engage.
LLMs are "qualitatively different in their formal linguistic capacities from models before roughly 2018" — a genuine discontinuity in formal competence. But this formal competence arises from an objective that leaves functional competence behind. The two competences are not on a continuum; they are served by different mechanisms.
The predictive implication is architectural. Models that succeed at real-life language use will need to mimic the human brain's division of labor between formal and functional competence through modularity: separate circuits for form-level processing and for world-connected functional processing. LLMs augmented with retrieval, tool use, and memory may approximate this modularity, but as components attached from the outside rather than as part of the model's design.
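As a rough illustration of that division of labor, here is a toy Python sketch. It is not taken from the paper: the class names (`FormModule`, `MemoryModule`, `ArithmeticModule`, `ModularAgent`) and the routing rule are hypothetical, and the form module is a trivial template standing in for an LLM. The point is only that the component producing well-formed text is kept separate from the components that supply world-connected content.

```python
import operator
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol


class FunctionalModule(Protocol):
    """Hypothetical interface: return content for a query, or None if not applicable."""

    def handle(self, query: str) -> Optional[str]: ...


class FormModule:
    """Stand-in for form-level processing (an LLM in practice).

    It only shapes content into a well-formed sentence; it has no
    knowledge of whether the content is true.
    """

    def realize(self, content: str) -> str:
        return f"The answer is {content}."

    def decline(self) -> str:
        return "I do not know."


@dataclass
class MemoryModule:
    """Toy fact store standing in for retrieval or long-term memory."""

    facts: Dict[str, str] = field(default_factory=dict)

    def handle(self, query: str) -> Optional[str]:
        key = query.lower().strip().rstrip("?").strip()
        return self.facts.get(key)


class ArithmeticModule:
    """Toy reasoning tool: evaluates 'what is A op B' for +, -, *."""

    _pattern = re.compile(r"what is (\d+)\s*([+\-*])\s*(\d+)", re.IGNORECASE)
    _ops = {"+": operator.add, "-": operator.sub, "*": operator.mul}

    def handle(self, query: str) -> Optional[str]:
        match = self._pattern.search(query)
        if match is None:
            return None
        left, op, right = match.groups()
        return str(self._ops[op](int(left), int(right)))


class ModularAgent:
    """Routes a query to functional modules, then hands the result to the form module."""

    def __init__(self, form: FormModule, functional_modules: List[FunctionalModule]):
        self.form = form
        self.functional_modules = functional_modules

    def answer(self, query: str) -> str:
        # Content comes from the first functional module that can handle the query.
        for module in self.functional_modules:
            content = module.handle(query)
            if content is not None:
                return self.form.realize(content)
        # No module supplied content, so the form module declines.
        return self.form.decline()


def build_demo_agent() -> ModularAgent:
    memory = MemoryModule(facts={"capital of france": "Paris"})
    return ModularAgent(FormModule(), [memory, ArithmeticModule()])


if __name__ == "__main__":
    agent = build_demo_agent()
    print(agent.answer("What is 6 * 7?"))
    print(agent.answer("Capital of France?"))
    print(agent.answer("Tell me a joke"))
```

In this sketch the form module never decides what is true; it only phrases whatever a functional module returns. That mirrors the modularity claim loosely. A tool-augmented LLM differs in that the language model itself usually decides when to call a tool.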
This is distinct from Bender & Koller's claim that meaning cannot be acquired from form alone (which rests on the joint-attention/communicative-intent argument). The Fedorenko finding adds a mechanistic neuroscience foundation: even if we grant that some meaning can emerge from distributional learning, the kind of competence that requires world integration is neurologically segregated and cannot be produced by the same mechanism as syntactic pattern-learning.
Inquiring lines that use this note as a source (11)
This note is a source for these synthesized inquiries.
- Can you separate grammatical competence from rhetorical commitment in language systems?
- How does enactive theory define language differently than computational linguistics?
- Does next-token prediction alone produce genuine functional language competence?
- Does embodiment and interaction matter for linguistic competence beyond pattern learning?
- What architectural changes would let language models develop genuine functional competence?
- Do LLMs have functional linguistic competence or only formal language ability?
- What role does language play as a cognitive scaffold versus communication tool?
- What cognitive abilities distinguish metalinguistic analysis from language use?
- What's the difference between formal and functional linguistic competence?
- What distinguishes communicative competence from human-like dialogue ability?
- Does functional integration determine cognitive system boundaries?
Related concepts in this collection (3)
- Can language models learn meaning from text patterns alone?
  Explores whether training on form alone (predicting the next word from prior words) could ever give language models access to communicative intent and genuine semantic understanding.
  Shares the formal/functional gap; Fedorenko adds a neuroscience mechanism; Bender/Koller add the communicative-intent argument.
- What makes linguistic agency impossible for language models?
  From an enactive perspective, does linguistic agency require embodied participation and real stakes that LLMs fundamentally lack? This matters because it challenges whether LLMs can truly engage in language or only generate text.
  The enactive view is a third account of why functional competence requires more than formal pattern-learning.
- Why does ChatGPT fail at implicit discourse relations?
  ChatGPT excels when discourse connectives are present but drops to 24% accuracy without them. What does this gap reveal about how LLMs actually process meaning and logical relationships?
  Behavioral evidence for the formal/functional gap: explicit connectives (formal cues) work; implicit relations (functional understanding) fail.
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Dissociating language and thought in large language models
- Are Emergent Abilities in Large Language Models just In-Context Learning?
- Training Large Language Models to Reason in a Continuous Latent Space
- Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
- Insert-expansions For Tool-enabled Conversational Agents
- Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning
- What does it mean to understand language?
- Large Linguistic Models: Investigating LLMs' metalinguistic abilities
Original note title
formal and functional linguistic competence are neurologically distinct — next-token prediction produces formal competence but not functional competence