INQUIRING LINE

Can we use LLM language without adopting LLM assumptions?

This explores whether we can take in the words LLMs produce without quietly inheriting the hidden premises baked into how they generate them — that language is a complete, extractable thing and that a model's strings mean what a human's would.


This reads the question as being about a kind of contamination: when we use LLM output, do we also absorb the assumptions the machinery rests on? The corpus suggests the danger is real, and that naming the assumptions is the first defense. The clearest statement is that LLM engineering rests on two premises enactive linguists reject outright: that language is a complete, stable object you can extract from text, and that a dataset can fully capture it (What hidden assumptions drive how we build language models?). Language, on this view, is a practice you participate in, not a substance you harvest. So the moment you treat the model's fluent strings as 'language' in the full human sense, you have already adopted the completeness assumption without noticing.

The sharpest separation comes from work arguing that LLM text generation and human communication are structurally different operations that merely share a surface form (Are language models and human speakers doing the same thing?). Humans use language to address and relate to someone; an LLM emits strings from a probability distribution. They look alike, but what produces them, what they do socially, and what a receiver should do with them all differ. The answer to the question lives right here: you can use LLM language without adopting LLM assumptions only if you keep that distinction live, reading the output as a generated artifact to be re-grounded, not as a speaker's utterance to be taken on trust.

Why the assumptions are sticky rather than harmless shows up in the failure modes. Models reason through semantic association, not symbolic logic, so their performance collapses once semantic content is decoupled from the reasoning task, even with the correct rules in context (Do large language models reason symbolically or semantically?). They can state a principle correctly and then fail to apply it, a split between explanation and execution that has no human analog (Can language models understand without actually executing correctly?; Can LLMs understand concepts they cannot apply?). If you assume the words carry the competence they imply, you inherit a false model of what's behind them.

There's also a quieter assumption about how meaning gets settled. LLMs operate in 'static grounding', retrieving and responding as if common ground already exists, while human communication builds it through clarification and repair (Why do language models skip the calibration step?). Using LLM language responsibly means supplying the dynamic grounding the system skips: treating its output as a first draft that needs the calibration loop, not a finished claim. Practically, that's the move that lets you keep the language and drop the assumption.
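The calibration loop described above can be sketched in a few lines. This is a minimal illustration, not a method taken from the cited notes: the names `calibrate`, `find_gaps`, and `repair` are hypothetical, and the two callbacks stand in for whatever a reader or harness uses to spot ungrounded spots in a draft and to resolve them (for example, by asking a clarifying question).

```python
from typing import Callable, List, Tuple


def calibrate(
    draft: str,
    find_gaps: Callable[[str], List[str]],
    repair: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Treat model output as a first draft and run a repair loop over it.

    find_gaps and repair are hypothetical callbacks supplied by the caller:
    find_gaps lists the places where common ground is missing, and repair
    returns a revised text given those gaps.
    Returns the final text and whether every gap was resolved.
    """
    text = draft
    for _ in range(max_rounds):
        gaps = find_gaps(text)
        if not gaps:
            # Nothing left to clarify: the draft is now grounded.
            return text, True
        text = repair(text, gaps)
    # Out of rounds: report honestly whether gaps remain.
    return text, not find_gaps(text)
```

The design choice worth noting is the boolean in the return value: the loop never silently promotes a draft to a finished claim, so a caller can tell a grounded result from one that simply ran out of rounds.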

The encouraging counterweight is that the gap is partly addressable through how you engage rather than what the model is. Prompting that forces explicit enumeration of the unstated preconditions an LLM glosses over raises accuracy from 30% to 85% (Do language models fail at identifying unstated preconditions?), and o1-style step-by-step reasoning can produce genuine metalinguistic analysis rather than mere performance (Can language models actually analyze language structure?). So the answer is a qualified yes: the words are usable and the assumptions are optional, but only if you actively name the completeness-and-communication premises the output ships with and supply the grounding, verification, and precondition-checking the model leaves out.
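One way to picture the precondition-enumeration move is as a prompt wrapper plus a parser. The sketch below is an assumption-laden illustration rather than the procedure used in the cited work: the function names and the `PRE:` / `ANSWER:` line prefixes are hypothetical conventions, and no particular model API is called.

```python
from typing import List, Optional, Tuple


def with_preconditions(question: str) -> str:
    """Wrap a question so the reply must list unstated preconditions first.

    The 'PRE:' and 'ANSWER:' prefixes are a hypothetical convention chosen
    for this sketch so the reply can be parsed afterwards.
    """
    return (
        "Before answering, list every unstated precondition that must hold "
        "for an answer to be valid, one per line, each prefixed with 'PRE:'.\n"
        "Then give the answer on a final line prefixed with 'ANSWER:'.\n\n"
        f"Question: {question}"
    )


def parse_reply(reply: str) -> Tuple[List[str], Optional[str]]:
    """Split a reply into its listed preconditions and its answer line."""
    preconditions: List[str] = []
    answer: Optional[str] = None
    for raw in reply.splitlines():
        line = raw.strip()
        if line.startswith("PRE:"):
            preconditions.append(line[len("PRE:"):].strip())
        elif line.startswith("ANSWER:"):
            answer = line[len("ANSWER:"):].strip()
    return preconditions, answer
```

Parsing the preconditions out separately means a reader can check each one before trusting the answer, and a reply that lists none is itself a signal to push back.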


Sources (8 notes)

What hidden assumptions drive how we build language models?

LLMs assume language is a complete stable thing extractable from text data. Enactive linguistics rejects both: language is a practice requiring embodied participation, and no dataset can capture its radical incompleteness and responsiveness.

Are language models and human speakers doing the same thing?

LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.

Do large language models reason symbolically or semantically?

When semantic content is decoupled from reasoning tasks, LLM performance collapses even with correct rules in context. Models rely on parametric commonsense and token associations rather than formal logical manipulation, constraining reasoning to training distribution semantics.

Can language models understand without actually executing correctly?

Large language models can articulate correct principles but systematically fail to apply them due to dissociated instruction and execution pathways. The 87% accuracy in explanations versus 64% in actions reveals this is not knowledge deficit but structural disconnect.

Can LLMs understand concepts they cannot apply?

Models can explain concepts accurately, fail to apply them, and recognize the failure—a triple pattern incompatible with human cognition. This indicates functionally disconnected explanation and execution pathways rather than simple knowledge gaps.

Why do language models skip the calibration step?

LLMs operate in static grounding mode—retrieving data and responding without clarification loops. Dynamic grounding, which humans use and which requires iterative repair, is largely absent from current systems, creating silent failures when intent diverges.

Do language models fail at identifying unstated preconditions?

LLMs struggle not from lacking world knowledge but from failing to bring background conditions forward as relevant constraints. Prompting that forces explicit enumeration of preconditions raises accuracy from 30% to 85%, revealing the frame problem persists in statistical systems.

Can language models actually analyze language structure?

OpenAI's o1 model successfully constructs syntactic trees and phonological generalizations through explicit step-by-step reasoning, revealing that LLM linguistic capability extends far beyond behavioral language tasks to genuine language analysis.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a researcher evaluating whether the contamination risk of LLM assumptions has shifted. The core question: can we use LLM language without adopting LLM assumptions? Treat the following as dated claims (spanning 2023–2026) to be re-tested against current models and practices.

What a curated library found — and when (dated claims, not current truth):
• Language completeness and data completeness are unstated premises in LLM engineering; users absorb them without noticing (2024–25).
• LLMs reason through semantic association, not symbolic logic; fluency collapses when meaning decouples from task (2023).
• Models exhibit "comprehension without competence"—stating a principle correctly but failing to apply it—a split with no human analog (2025–26).
• LLMs operate in static grounding (assume common ground exists) rather than dynamic grounding (build it through repair); users must supply the missing calibration loop (2024–25).
• Step-by-step reasoning and explicit precondition enumeration can shift models toward genuine metalinguistic analysis rather than mere performance (2023, 2025).

Anchor papers (verify; mind their dates):
• arXiv:2407.08790 (2024-07): Large Models of What?—the engineering-vs-agency distinction.
• arXiv:2507.10624 (2025-07): Comprehension Without Competence—failure mode taxonomy.
• arXiv:2506.08952 (2025-06): Can LLMs Ground—grounding under uncertainty.
• arXiv:2602.06176 (2026-02): Large Language Model Reasoning Failures—recent failure catalog.

Your task:
(1) RE-TEST EACH CONSTRAINT. For each claim above, judge whether post-2026 models (o3, Claude-4+, specialized reasoning chains), new training methods (constitutional AI, synthetic data curation), or orchestration (multi-step verification, human-in-loop harnesses) have relaxed or overturned the limitation. Separate the durable question (still open) from perishable limitation (possibly resolved). Cite what resolved it, and say plainly where constraints still hold.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months—especially any claiming LLMs now do genuine symbolic reasoning, dynamic grounding, or competence-matching-explanation.
(3) Propose 2 research questions that ASSUME the regime has shifted: e.g., "If step-by-step reasoning now reliably decouples explanation from execution, how should we redesign evaluation to detect when it hasn't?"

Cite arXiv IDs; flag anything you cannot ground in a real paper.
