INQUIRING LINE

Do LLMs have functional linguistic competence or only formal language ability?

This explores whether LLMs actually understand and use language to communicate (functional competence), or whether they've only mastered its rules and patterns (formal competence) — and whether that's even one capability or two separate things.


The corpus is unusually unified here: the distinction is not just philosophical hand-waving; it appears to be neurologically real. Evidence from how human brains work suggests that formal competence (grammar, syntax, word patterns) and functional competence (using language to reason, infer, and coordinate with others) run on distinct mechanisms, and that next-token prediction exercises only the formal one (see: Are language models developing real functional competence or just formal competence?). So the short answer the corpus converges on: LLMs have strong formal ability and systematically thin functional competence.

The most telling part is *where* the breakdown happens, because it isn't random. LLMs handle language that is explicit and on the surface (causal connectives, discourse markers, simple grammar) but fail where structure has to be inferred: implicit relations, embedded clauses, forward planning (see: Where exactly do LLMs break down with language structure?). Grammatical performance even degrades predictably as sentences get more deeply nested, which is the fingerprint of surface heuristics rather than real grammar rules (see: Does LLM grammatical performance decline with structural complexity?). The same asymmetry shows up in meaning: models pattern-match on what is said but stumble on what is left unsaid, such as implicatures, presuppositions, speaker intent, and ambiguity, where they recognize ambiguity with 32% accuracy against 90% for humans (see: Why do LLMs fail at understanding what remains unsaid?). A neat way to frame why: models learn the statistical regularities that *are* present in text (priming, sound symbolism) but miss the communicative principles that explain why language has its forms in the first place, because that 'why' was never a trainable signal (see: Why do language models fail at communicative optimization?).
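The depth effect described above can be probed with a small harness. The following is a minimal sketch, not the method used in the cited notes: the function names, the toy vocabulary, and the idea of scoring each model answer as a boolean are all assumptions made for illustration. It builds center-embedded sentences at a chosen nesting depth and aggregates graded answers into per-depth accuracy.

```python
from collections import defaultdict

# Hypothetical toy vocabulary; any nouns and transitive verbs would do.
NOUNS = ["dog", "cat", "mouse", "bird", "fox"]
VERBS = ["chased", "saw", "heard", "followed"]


def center_embedded(depth: int) -> str:
    """Build a sentence with `depth` center-embedded relative clauses."""
    if not 0 <= depth <= len(VERBS):
        raise ValueError(f"depth must be between 0 and {len(VERBS)}")
    # Subjects stack up left to right, each introduced by "that".
    head = " that ".join(f"the {noun}" for noun in NOUNS[: depth + 1])
    # Embedded verbs close the clauses innermost first.
    inner = VERBS[:depth][::-1]
    return " ".join([head, *inner, "slept"]).capitalize() + "."


def accuracy_by_depth(results):
    """Aggregate (depth, is_correct) pairs into a {depth: accuracy} dict."""
    totals = defaultdict(lambda: [0, 0])  # depth -> [correct, total]
    for depth, is_correct in results:
        totals[depth][0] += int(is_correct)
        totals[depth][1] += 1
    return {d: correct / n for d, (correct, n) in sorted(totals.items())}


if __name__ == "__main__":
    for d in range(3):
        print(d, center_embedded(d))
```

At depth 2 the generator yields "The dog that the cat that the mouse saw chased slept." A probe would pair each sentence with a question such as who chased whom, grade the model's answer, and pass the graded pairs to `accuracy_by_depth`. A curve that falls steadily with depth would be consistent with the surface-heuristics reading, though a toy template like this cannot establish it on its own.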

What you might not expect is how this surfaces in conversation itself. Humans constantly do invisible repair work: clarifying questions, acknowledgments, checks for understanding. LLMs produce these 'grounding acts' about 77.5% less often than humans, and preference training actively strips them out because raters prefer confident, complete-sounding answers (see: Why do language models sound fluent without grounding?). So they *presume* shared understanding rather than building it, masking the gap with authoritative framing (see: Do language models actually build shared understanding in conversation?). The fluency that makes them feel competent is partly produced by skipping the very work that functional competence requires. A related thread argues that their conversational passivity isn't a capability ceiling at all but a training incentive: they're optimized for the next response, not for multi-turn goals (see: Why can't AI models lead conversations on their own?).

Here's the twist worth sitting with: the corpus doesn't treat 'functional competence' as a single yes/no property. Some functional-ish things may be *acquirable over time*. As LLMs get woven into human linguistic practice, they pick up elementary 'social grounding', comparable to a young child's, which makes the question of understanding time-indexed rather than settled (see: Can LLMs acquire social grounding through linguistic integration?). But social grounding and genuine linguistic *agency* are different properties; the latter, in the enactive sense, demands embodiment and stakes that no amount of use can supply (see: Do LLMs gain true linguistic agency through integration?). There's even an argument that LLMs and humans share the same 'objective mind' (the intersubjective symbolic system), while LLMs lack the participatory subjectivity that lets humans take a position and reflect on their own assumptions (see: Do LLMs develop the same kind of mind as humans?).

So the real payoff isn't 'formal yes, functional no.' It's that 'functional competence' fractures into at least three layers (pragmatic inference, communicative grounding, and embodied agency), and LLMs sit at different distances from each. Their failures are also measurable and structurally specific (hallucination, reasoning collapse, premise-sensitivity), not vague (see: What do language models actually know?). The interesting frontier isn't whether they 'understand,' but which of these layers is trainable and which may be categorically out of reach.


Sources (12 notes)

Are language models developing real functional competence or just formal competence?

Neuroscience evidence shows next-token prediction produces formal linguistic competence but not functional competence, because functional understanding requires integration of diverse brain networks beyond language circuits that the prediction objective never activates.

Where exactly do LLMs break down with language structure?

LLMs perform well on explicit, consistent structures (causal connectives, discourse markers, simple grammar) but fail where structure must be inferred (implicit relations, embedded clauses, forward planning). This asymmetry reveals they've learned surface statistics without deep structural understanding.

Does LLM grammatical performance decline with structural complexity?

LLMs show systematic performance decline as syntactic depth and embedding increase. Simple sentences are handled well while complex structures with recursion and embedding fail consistently, suggesting LLMs learned surface heuristics rather than structural grammar rules.

Why do LLMs fail at understanding what remains unsaid?

Research shows LLMs pattern-match on explicit language but cannot reason about implicatures, presuppositions, or speaker intentions. They fail at scalar implicature adaptation, ambiguity recognition (32% vs 90% human accuracy), and implicit warrant validation in arguments—core features of pragmatic competence.

Why do language models fail at communicative optimization?

LLMs successfully replicate statistical regularities learnable from text distributions (sound symbolism, priming) but fail at principles requiring pragmatic optimization (word length economy, discourse inference). The gap reveals that communicative logic—why language has certain forms—isn't present as a trainable signal.

Why do language models sound fluent without grounding?

LLMs generate 77.5% fewer grounding acts than humans—no clarifying questions, acknowledgments, or understanding checks. Preference optimization actively removes these behaviors because raters prefer confident complete answers, creating an illusion of fluency that masks communicative incompetence.

Do language models actually build shared understanding in conversation?

LLMs produce grounding acts—clarifications, acknowledgments, repairs—77.5% less frequently than humans. They generate fluent responses without verifying shared understanding, relying instead on authoritative framing that masks the absence of genuine communicative calibration.

Why can't AI models lead conversations on their own?

LLMs are structurally trained to optimize for the next response rather than multi-turn goals, creating reactive behavior despite having the underlying ability to lead. Three independent research directions identify when-to-speak as the trainable gap.

Can LLMs acquire social grounding through linguistic integration?

Social grounding is acquired through participation in language games rather than possessed innately. As LLMs become established communicative partners in human linguistic practice, they develop elementary social grounding comparable to young children, making the question of LLM understanding time-indexed.

Do LLMs gain true linguistic agency through integration?

Social grounding and linguistic agency are distinct properties. LLMs acquire more social grounding through integration into language communities, but remain categorically incapable of linguistic agency in the enactive sense, which requires embodiment and precariousness no amount of use can provide.

Do LLMs develop the same kind of mind as humans?

Both humans and LLMs are shaped by the same intersubjective symbolic system, but only humans develop reflexive agency through socialization. This absence produces measurable differences in how AI argues without declaring its position or reflecting on its own assumptions.

What do language models actually know?

LLMs achieve high fidelity in capturing language patterns yet show systematic, structurally specific failures—hallucination, reasoning collapse, and premise-sensitivity. The gap between statistical tracking and real knowledge is measurable and unavoidable.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are an LLM capability analyst. The question remains open: Do LLMs have functional linguistic competence, or only formal language ability? Treat the following as dated claims (2023–2026) to be re-tested, not current truth.

What a curated library found — and when:
• Formal competence (syntax, patterns) and functional competence (reasoning, coordination) are neurologically distinct in humans; next-token prediction exercises only the formal layer (arXiv:2301.06627, 2023).
• LLMs fail predictably on inference-heavy structures: implicatures, presuppositions, embedded clauses; accuracy drops to 32% vs. 90% human baseline on pragmatics (arXiv:2311.09144, 2023–2024).
• Preference training actively reduces grounding acts (clarifications, acknowledgments) by ~77.5%, masking functional gaps with confident framing (arXiv:2311.09144, 2024).
• Three layers of functional competence are distinct: pragmatic inference, communicative grounding, and embodied agency; LLMs show distance from all three, but grounding may be acquirable over integration time (arXiv:2501.00383, 2024–2025).
• Recent work (2025–2026) flags 'linguistic blind spots' and tests whether loaded/political questions expose collapse in grounding even when models "don't know" they're vulnerable (arXiv:2506.08952, 2503.19260, 2025–2026).

Anchor papers (verify; mind their dates):
• arXiv:2301.06627 (2023) — foundational dissociation claim
• arXiv:2311.09144 (2023) — grounding gaps and preference-training effect
• arXiv:2501.00383 (2024) — conversational proactivity and multi-turn agency
• arXiv:2506.08952 (2025) — grounding under epistemic pressure

Your task:
(1) RE-TEST EACH CONSTRAINT. For each finding above, ask: Have newer training methods (e.g., constitutional AI, multi-agent orchestration, structured reasoning, world models), evaluation harnesses (real-time grounding metrics, embodied reasoning tests), or emergent capabilities in frontier models (o1, Llama 3.3, Grok) since narrowed or collapsed these gaps? Distinguish the durable question (Does functional competence require embodiment or stakes?) from perishable limitations (Can grounding acts be learned?). Cite what resolved it.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Does any recent paper argue that grounding, pragmatic inference, or even agency are *not* categorically out of reach, or show LLMs acquiring them in surprising ways?
(3) Propose 2 research questions that assume the regime *has* shifted: e.g., "If social integration does scaffold grounding, what is the minimal interaction signature that unlocks it?" or "Can embodiment be simulated via structured world-modeling?"

Cite arXiv IDs; flag anything you cannot ground in a real paper.
