SYNTHESIS NOTE

Can language models learn meaning from text patterns alone?

Explores whether training on form alone—predicting the next word from prior words—could ever give language models access to communicative intent and genuine semantic understanding.

Synthesis note · 2026-02-21 · sourced from Linguistics, NLP, NLU

Bender & Koller (2020) make a specific structural argument, not just an intuitive one. They define meaning as a relation M ⊆ E × I, where E is the set of natural language expressions and I is the set of communicative intents: each pair (e, i) in M links an expression to an intent it can be used to evoke. Understanding language means retrieving i given e. But communicative intents are about something outside of language, so form alone (marks on a page, pixels, bytes) is insufficient for recovering them.

The reasoning: without access to a mechanism for hypothesizing and testing underlying communicative intents, reconstructing them from form alone is impossible. Language modeling predicts the next token given prior tokens — purely a form-to-form operation. The training signal provides no information about what intents the forms were used to evoke.
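The form-to-form character of this objective can be made concrete with a minimal sketch. The bigram counter below is a deliberately tiny stand-in for a language model, not how any production LLM is built, and the expression/intent pairs (including the `REQUEST(...)` labels) are hypothetical toy data invented for illustration. What it shows is that the training function only ever receives token sequences; the intents are dropped before training and never enter the model.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count next-token frequencies. The only input is token sequences (form)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent continuation of `prev`, or None if unseen."""
    if prev not in counts:
        return None
    return counts[prev].most_common(1)[0][0]


# Hypothetical toy data: each expression e is paired with an intent i.
paired = [
    ("pass the salt", "REQUEST(salt)"),
    ("pass the salt", "REQUEST(salt)"),
    ("pass the ball", "REQUEST(ball)"),
]

# Training sees only the expressions; the intents are discarded here.
model = train_bigram(e for e, _ in paired)

print(predict_next(model, "the"))  # salt
```

The model predicts "salt" after "the" because that continuation is most frequent in the toy corpus, not because anything in it represents a request. Nothing in `model` refers to the intent column.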

Human language acquisition illustrates the point by contrast. What is critical for meaning acquisition is not just interaction but joint attention — situations where child and caregiver both attend to the same thing and are both aware of this fact. Learning meaning requires the ability to be aware of what another person is attending to and guess what they are intending to communicate. Intersubjectivity is not incidental to language learning; it is its mechanism.

Harnad's formulation of the symbol grounding problem makes the same point: a non-speaker of Chinese cannot learn the meanings of Chinese words from Chinese dictionary definitions alone. Something outside the symbol system is needed to anchor the symbols, and form-to-form prediction cannot provide this anchor.
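The dictionary scenario can be sketched as a closed graph of symbols. The toy dictionary below is hypothetical, and its opaque names (`S1` to `S4`) stand in for words as a non-speaker would see them. Under that assumption, following definitions transitively only ever yields more entries from the same dictionary.

```python
def reachable(dictionary, start):
    """Follow definitions transitively and collect every symbol encountered."""
    seen, stack = set(), [start]
    while stack:
        sym = stack.pop()
        if sym in seen:
            continue
        seen.add(sym)
        stack.extend(dictionary.get(sym, []))
    return seen


# Hypothetical toy dictionary: every headword is defined using other headwords.
dictionary = {
    "S1": ["S2", "S3"],
    "S2": ["S4"],
    "S3": ["S1", "S4"],
    "S4": ["S2"],
}

closure = reachable(dictionary, "S1")

# Every symbol reached is itself a headword: lookup never exits the system.
print(closure <= set(dictionary))  # True
```

However long the lookup chain runs, the result is a set of symbols, never a referent. An anchor would have to come from something that is not another dictionary entry.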

Mutual understanding is structurally unavailable, even in conversational media. The form-only training constraint has a downstream consequence for AI operating in conversational channels: an LLM cannot seek mutual understanding with the user, because mutual understanding requires the intersubjectivity that form-only training cannot provide. The communication is one-way even when it occurs on a medium designed for mediated social interaction. This reframes AI social-media posts as a specific genre: indirect discourse, a form of writing even when it appears in an interactive environment. The user reads the post and the medium formally supports reply, but the AI is not available for the second turn that would close a loop of mutual understanding, and was never going to be. The channel looks communicative; the content is monological writing deposited in a conversational shape.

This is distinct from the claim that LLMs "have no understanding." It is the more precise claim that the training mechanism — string prediction — is in principle incapable of providing the signal that meaning acquisition requires, regardless of scale.

Original note title

language models trained on form alone cannot acquire meaning because meaning requires joint attention and intersubjectivity