Why do language models fail to use knowledge they possess?
Large language models contain relevant world knowledge but often fail to activate it without explicit cues. This note explores whether the bottleneck lies in knowledge storage or in the inference process that decides which background facts apply.
The Heuristic Override Benchmark study includes an explicitness gradient: the same problem is posed repeatedly while the salience of the cue that should activate the constraint is varied. The result decisively locates the failure. Adding a single subtle hint, for example emphasizing the key object that must be present ("get my car washed"), raises accuracy by 15.3 percentage points on average across 14 models, from 59.2 percent to 74.5 percent.
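To make the size of that effect concrete, here is a small sketch that expresses the two reported averages in three ways: as an absolute gain in percentage points, as a gain relative to the baseline, and as the share of baseline errors the hint removes. Only the two accuracy figures come from the note; the function names are illustrative, and the relative and error-reduction numbers are derived here rather than reported by the study.

```python
# Average accuracy across 14 models, as reported in the note.
BASELINE_PCT = 59.2  # standard prompt, no hint
HINTED_PCT = 74.5    # same problems with one subtle hint


def percentage_point_gain(before: float, after: float) -> float:
    """Absolute change in accuracy, in percentage points."""
    return after - before


def relative_gain(before: float, after: float) -> float:
    """Change in accuracy as a fraction of the baseline accuracy."""
    return (after - before) / before


def error_reduction(before: float, after: float) -> float:
    """Fraction of baseline errors that the intervention removes."""
    return (after - before) / (100.0 - before)


if __name__ == "__main__":
    pp = percentage_point_gain(BASELINE_PCT, HINTED_PCT)
    rel = relative_gain(BASELINE_PCT, HINTED_PCT)
    err = error_reduction(BASELINE_PCT, HINTED_PCT)
    print(f"gain: {pp:.1f} percentage points")        # gain: 15.3 percentage points
    print(f"relative to baseline: {rel:.1%}")         # relative to baseline: 25.8%
    print(f"baseline errors removed: {err:.1%}")      # baseline errors removed: 37.5%
```

Read as an error reduction, the hint removes more than a third of the baseline failures on these averages, which fits the note's reading that the knowledge was available all along.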
This is diagnostic. If models lacked the relevant world knowledge, no amount of surface emphasis would help. They have the knowledge. The problem is that the knowledge does not get activated unless the prompt cues it explicitly. The bottleneck is not in storage, retrieval capacity, or chain-of-thought depth. It is in the inferential step that decides which background facts are relevant to the current decision.
Goal-decomposition prompting tells the same story from another angle. Forcing the model to enumerate preconditions before answering, for instance by asking "what must be true for walking to be the right choice here?", recovers 6 to 9 percentage points. The intervention works because it converts the implicit constraint into a self-generated explicit hint. The model can do the reasoning when it is forced to enumerate; it cannot reliably trigger the enumeration on its own when a salient surface heuristic is competing for attention.
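A minimal sketch of what goal-decomposition prompting could look like in practice, assuming a plain-text prompt interface. The function names, the prompt wording, and the sample question are all hypothetical illustrations; they are not the benchmark's actual prompts or items. The only element taken from the note is the precondition question itself.

```python
from typing import List


def build_direct_prompt(question: str) -> str:
    """Baseline: ask the question with no scaffolding (hypothetical wording)."""
    return f"{question}\nGive your final answer."


def build_goal_decomposition_prompt(question: str, options: List[str]) -> str:
    """Ask the model to enumerate preconditions for each option before answering.

    The per-option question mirrors the one quoted in the note; everything
    else about the wording is an assumption made for illustration.
    """
    lines = [question, "", "Before answering, work through these steps:"]
    for i, option in enumerate(options, start=1):
        lines.append(
            f"{i}. What must be true for '{option}' to be the right choice here?"
        )
    lines.append(
        f"{len(options) + 1}. Check each precondition against the situation described."
    )
    lines.append("Only then give your final answer.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative question, not taken from the benchmark.
    question = (
        "I want to get my car washed. The car wash is a five-minute walk away. "
        "Should I walk or drive?"
    )
    print(build_goal_decomposition_prompt(question, ["walk", "drive"]))
```

The design point is that the scaffold never states the constraint. It only forces the enumeration step, so any explicit hint that appears is one the model generated for itself.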
The implication for deployment is unsettling. Standard prompts do not activate the relevant knowledge. The fact that the knowledge is present in the model does not mean it will be brought to bear on the decision the user actually needs the model to make. Knowledge possession and knowledge activation are decoupled.
Inquiring lines that use this note as a source (8)
This note is a source for these synthesized inquiries.
- When does knowledge activation fail across different model architectures?
- How much does memorization capacity limit a model's ability to learn new information?
- How do we distinguish knowledge encoding from knowledge usage in models?
- When does encoded knowledge fail to influence language model generation?
- Why might encoded world knowledge fail to actually influence language model outputs?
- How can a model explain something correctly yet fail to apply it?
- Do models verbalize their implicit knowledge when that knowledge influences their output?
- What makes a model fail to activate relevant skills from its own harness?
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
- Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
- Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
- Premise Order Matters in Reasoning with Large Language Models
- LLMs can implicitly learn from mistakes in-context
- Context Embeddings for Efficient Answer Generation in RAG
- Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Original note title
LLM reasoning failure on implicit constraints is an inference bottleneck — relevant knowledge is present but fails to activate without explicit cuing