Does language convey meaning purely through relational structure without external grounding?
This explores whether meaning lives entirely inside the web of relationships between words — the way they pattern against each other — or whether it also needs some anchor outside language, in the world, the body, or shared human intention.
The corpus is split, and the split is the interesting part. On one side, large language models are a working demonstration that fluent, culturally situated language can be generated from relational structure alone. One line of work argues that LLMs essentially operationalize Saussure's *langue*, the idea that meaning comes from differences between signs, by compressing relational patterns out of text with no external referents at all (Can language models learn meaning without engaging the world?). The geometry backs this up: models spontaneously encode syntactic relations as structured positions in their activation space, suggesting that real relational meaning gets built without anyone wiring it in (How do language models encode syntactic relations geometrically?).
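To make the geometric point concrete, here is a toy sketch of why direction can carry relational information that distance alone cannot. It is not the Polar Probe itself: the 2-D vectors, the relation labels, and the helper names are all made up for illustration, and a real probe would operate on learned projections of model activations.

```python
import math

# Hypothetical 2-D "embeddings", invented for illustration (not from a real model).
head = (0.0, 0.0)
dependents = {
    "subject": (1.0, 0.0),  # same distance from the head as "object"...
    "object": (0.0, 1.0),   # ...but pointing in a different direction
}

def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])

def angle_degrees(a, b):
    """Direction of the vector from a to b, in degrees."""
    return math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))

def describe(head_vec, dep_vec):
    """Read out both the distance and the direction of a head-dependent pair."""
    return {
        "distance": distance(head_vec, dep_vec),
        "angle": angle_degrees(head_vec, dep_vec),
    }

readouts = {name: describe(head, vec) for name, vec in dependents.items()}

for name, readout in readouts.items():
    print(name, readout)
```

In this toy setup both dependents sit at the same distance from the head, so a distance-only readout cannot tell the two relations apart, while the angle separates them.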
But the same corpus pushes back hard, and it does so by refusing the word "purely." A strong counter-position holds that meaning *requires* the relation between expressions and communicative intent; since models only ever see form-to-form patterns, with no access to shared attention, they cannot reconstruct the grounding that meaning needs (Can language models learn meaning from text patterns alone?). There is empirical fuel here too: models systematically prefer higher-frequency surface phrasings over semantically identical rare ones, which looks more like tracking statistical mass than recognizing meaning (Do language models really understand meaning or just surface frequency?).
The most useful move the collection makes is to dissolve the yes/no framing entirely. Grounding isn't binary; it comes in degrees and kinds. One framework splits it three ways: *functional* grounding (strong in LLMs), *social* grounding (weak, but growing through human interaction), and *causal* grounding (indirect, mediated through text) (Does semantic grounding in language models come in degrees?). So the answer to "purely relational?" becomes: relationally strong, externally thin, but not zero (What grounds language understanding in systems without embodiment?). Even the missing external anchor turns out to be partially recoverable: models extract structured world-representations from data that *was* produced by causally grounded humans, inheriting a secondhand, gappy contact with reality (Can large language models develop genuine world models without direct environmental contact?). And when you give a system live external feedback, interleaving reasoning with real tool queries, hallucination and error propagation drop, which is direct evidence that grounding adds something relational structure alone doesn't supply (Can interleaving reasoning with real-world feedback prevent hallucination?).
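The interleaving described above can be sketched as a small loop. This is a minimal illustration of the think-act-observe pattern, not the ReAct implementation: `FACTS`, `lookup`, `scripted_policy`, and `run_episode` are hypothetical names, the "tool" is a dictionary standing in for something like a Wikipedia query, and the policy is a hand-written stub standing in for a language model.

```python
# Hypothetical stand-in for an external tool such as a Wikipedia lookup.
FACTS = {"capital of France": "Paris"}

def lookup(query):
    """External tool call: returns an observation, or None if nothing is found."""
    return FACTS.get(query)

def scripted_policy(question, observations):
    """Stub standing in for the language model.

    Returns (thought, action, argument) given what has been observed so far.
    """
    if not observations:
        return ("I should check this rather than guess.", "lookup", question)
    latest = observations[-1]
    if latest is None:
        return ("The tool found nothing, so I should not invent an answer.",
                "finish", "unknown")
    return ("The observation answers the question.", "finish", latest)

def run_episode(question, policy, max_steps=3):
    """Alternate reasoning steps with tool calls until the policy finishes."""
    observations = []
    trace = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, observations)
        trace.append((thought, action, argument))
        if action == "finish":
            return argument, trace
        # External feedback enters here and shapes the next reasoning step.
        observations.append(lookup(argument))
    return "unknown", trace

answer, trace = run_episode("capital of France", scripted_policy)
missing_answer, missing_trace = run_episode("capital of Atlantis", scripted_policy)
```

The design point is that the final answer is conditioned on an observation from outside the system: when the lookup returns nothing, the stub policy declines to answer instead of producing a fluent guess.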
Here's the thing you might not have known you wanted to know: the deepest version of this question isn't about reference to objects at all. It's about whether meaning needs a *subject*. Several notes converge on the idea that subjecthood isn't something language expresses from the outside; it's produced *within* communicative events, through accountability and an evaluative stance toward what's said (Does language create subjects or express them?). Through that lens, a system can produce perfectly relational, contextually appropriate text and still miss the relational-normative conditions of genuine communication, much as a walker-shaped puppet never actually walks (Does behavioral speech output prove communicative subjecthood?). So the grounding language might most need isn't a tether to the physical world but a tether to the social, accountable relationships in which meaning gets used (Can we defend modest mental attributions to large language models?).
Sources (11 notes)
Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.
The Polar Probe shows LLMs represent syntactic type and direction through both distance and angular position between embeddings, nearly doubling accuracy over distance-only methods. This demonstrates neural networks spontaneously learn structured, symbolic-compatible geometry.
Bender & Koller argue that meaning requires the relation between expressions and communicative intents. Since LLMs are trained only on form-to-form prediction with no access to shared attention or intent, they cannot reconstruct the meaning that grounds language.
LLMs show consistent preference for higher-frequency surface forms over semantically equivalent rare paraphrases across math, machine translation, commonsense reasoning, and tool calling. This suggests models track statistical mass from pretraining rather than meaning-recognition as their primary mechanism.
Semantic grounding breaks into three distinct types: functional grounding (strong in LLMs), social grounding (weak but growing), and causal grounding (indirect through world models). LLMs score differently on each dimension, making the yes-or-no understanding question misleading.
Language models achieve functional grounding through relational language patterns but lack social grounding through participatory agency and causal grounding through embodied environmental contact. Social grounding can increase through human integration, but linguistic agency requires architectural changes beyond training.
LLMs form structured world representations by extracting regularities from training data produced by causally grounded humans. This constitutes indirect causal grounding mediated through text, though the chain has gaps that limit real-time verification and model updating.
ReAct demonstrates that alternating verbal reasoning with external actions (Wikipedia API queries, environment interaction) limits error propagation by injecting real-world feedback at each step. On knowledge-intensive tasks, this mitigates the hallucination and error propagation common in pure chain-of-thought reasoning; on the interactive benchmarks ALFWorld and WebShop, the paper reports gains of 34 and 10 percentage points in absolute success rate over imitation and reinforcement learning baselines.
Subjecthood is produced within communicative events, not possessed prior to them. This convergent position across philosophy, linguistics, and cognitive science inverts the standard picture of language as a tool used by pre-existing subjects.
Chalmers' test passes any system producing contextually appropriate text, but communicative subjecthood requires relational-normative conditions like accountability and evaluative stance. The test is calibrated to the wrong phenomenon, creating false positives like puppets that are walker-shaped without walking.
Both robustness and etiological deflationist arguments beg the question against inflationism. A graded approach ascribing metaphysically undemanding states like beliefs and desires—while withholding consciousness claims—mirrors how we treat non-human animals.