SYNTHESIS NOTE

Do language models fail reasoning tests that humans pass?

Standard critiques claim LLMs lack real reasoning ability, but do humans actually perform better on content-independent reasoning tasks? Examining whether the cognitive bar differs for artificial versus human intelligence.

Synthesis note · 2026-05-02 · sourced from Linguistics, NLP, NLU

Lampinen et al. relitigate a fifty-year cognitive-science debate using LLM behavior as the new evidence. The classical symbolist line (Marcus, Fodor) defines abstract reasoning as content-independent: "X is bigger than Y" implies "Y is smaller than X" regardless of what X and Y are, and a system whose reasoning depends on the values of X and Y is not really reasoning. By that criterion, current LLMs fail. But the inconvenient parallel evidence Lampinen et al. marshal is that humans fail it too: across the Wason selection task, syllogisms, and natural language inference (NLI), human reasoning is heavily content-sensitive, in the same patterns LMs show.
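To make the content-independence criterion concrete, here is a minimal Python sketch built on the Wason selection task. The card sets and the function name `cards_to_turn` are hypothetical illustrations, not materials from Lampinen et al. The sketch shows only what the symbolist criterion demands: for a rule of the form "if P then Q," the cards worth turning over are fixed by logical form alone, so the answer should not change when the content does.

```python
from typing import List, Tuple

# Each card is (visible label, logical form of the visible face).
# The form tags are relative to a rule "if P then Q".
Card = Tuple[str, str]


def cards_to_turn(cards: List[Card]) -> List[str]:
    """Return the labels of cards that could falsify 'if P then Q'.

    Only a visible P (the hidden side might be not-Q) or a visible
    not-Q (the hidden side might be P) can reveal a violation.
    The labels themselves are never inspected: the selection is
    content-independent by construction.
    """
    return [label for label, form in cards if form in ("P", "not Q")]


# Hypothetical abstract framing:
# "if a card shows a vowel, its other side shows an even number".
abstract = [("A", "P"), ("K", "not P"), ("4", "Q"), ("7", "not Q")]

# Hypothetical familiar framing:
# "if a person is drinking beer, they are over 18".
social = [
    ("drinking beer", "P"),
    ("drinking cola", "not P"),
    ("age 25", "Q"),
    ("age 16", "not Q"),
]

abstract_choice = cards_to_turn(abstract)
social_choice = cards_to_turn(social)
```

Under this criterion the two framings are the same problem. The content-effects finding is that neither humans nor LMs treat them that way: accuracy tends to be higher when the content is familiar or believable than when it is abstract.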

The conclusion forks. Either the criterion is wrong, or human cognition isn't doing what the symbolist account claims it does. Lampinen et al. lean toward the former: if humans and LMs both succeed and fail along the same content-form axis, the connectionist account, in which inferences are grounded in learned semantics, may describe both better than the symbolist account describes either. This converges with the note "LLM semantic grounding is tri-partite" (functional grounding is strong, social grounding is weak, causal grounding is indirect): the grounding picture is more nuanced than "absent or present," and the same nuance applies to human reasoning, just with different mixtures.

For Language as Event, this insight is load-bearing. The standard critique, "LLMs don't really reason, they just match patterns," turns into a parallel claim about humans: humans also don't reason in pure logical form; we reason in patterns weighted by semantic content, and we reach correct logical conclusions partly because the content happens to support them. In Saussurean terms, there is no actual reasoner that operates over pure langue. Reasoning always happens in parole, in particular utterances with particular content. The content-effects literature is the empirical evidence that the langue/parole separation breaks at the cognitive level too, not just at the linguistic level.

The symmetry claim does not absolve LLMs of their distinctive failure modes. It does block one specific framing: "LLMs fail where humans succeed" is not what the data show. The data show that both succeed and fail along the same content-form axis. Where they diverge is elsewhere (in the override capacity, in the handling of novel structure, in the relation to grounded experience), but content-sensitivity itself is shared, and a criterion that uses it to separate real reasoning from fake reasoning would classify human reasoning as fake too.

Original note title

content-independence is the wrong target — the symbolic-versus-connectionist debate dissolves once content effects are recognized as ubiquitous in both humans and LMs