Why does reasoning training help math but hurt medical tasks?
Explores whether reasoning and knowledge rely on different network mechanisms, and why training one might undermine the other across different domains.
The "Decoupling Knowledge and Reasoning in LLMs" paper proposes a testable two-phase model of LLM inference by contrasting fast thinking (no chain-of-thought) with slow thinking (CoT-enabled). Fast thinking engages Phase 1 only: knowledge retrieval, localized mainly in lower network layers. Slow thinking adds Phase 2: reasoning adjustment in higher layers. Comparing the two modes isolates each phase's contribution.
Across 15 LLMs and 3 datasets, the paper reports three findings:
Domain-specificity of reasoning benefit: Phase 2 (reasoning adjustment) helps math, physics, and chemistry but can impair performance on knowledge-intensive domains. In medical tasks, the Phase 1 knowledge retrieved may be more reliable than the Phase 2 reasoning applied on top of it — reasoning adjustment introduces error rather than correcting it.
Scaling asymmetry: parameter scaling improves both phases, but knowledge improvement (Phase 1) dominates. Larger models know more, and this knowledge advantage outpaces the reasoning advantage. Scaling makes models more "prudent" (better at not making errors) across all domains, but only "more intelligent" (better at novel inference) in reasoning-intensive ones.
Layer localization: knowledge retrieval is primarily a lower-layer phenomenon; reasoning adjustment operates in higher layers. This is a functional architectural separation — not just a behavioral one.
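The fast-versus-slow comparison behind these findings can be sketched as per-domain arithmetic. The snippet below is a minimal illustration, not the paper's actual metric: it assumes one accuracy per domain under fast thinking and one under slow thinking, and treats their difference as a rough proxy for the Phase 2 (reasoning adjustment) contribution. The function name and all numbers are hypothetical.

```python
def reasoning_adjustment(fast_acc, slow_acc):
    """Return slow-minus-fast accuracy per domain.

    A positive value suggests reasoning adjustment helped; a negative
    value suggests it introduced more errors than it corrected.
    """
    return {domain: round(slow_acc[domain] - fast_acc[domain], 4)
            for domain in fast_acc}


# Hypothetical accuracies, chosen only to illustrate the two regimes.
fast = {"math": 0.40, "medicine": 0.70}   # no chain-of-thought
slow = {"math": 0.65, "medicine": 0.66}   # chain-of-thought enabled

gain = reasoning_adjustment(fast, slow)
for domain, delta in gain.items():
    verdict = "helps" if delta > 0 else "hurts"
    print(f"{domain}: reasoning adjustment {verdict} ({delta:+.2f})")
```

With these made-up inputs the math domain shows a positive adjustment and the medical domain a small negative one, which is the qualitative pattern the first finding describes.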
Layer localization provides a mechanistic explanation for the SFT knowledge gap. CoT fine-tuning and RLVR primarily modify higher-layer behavior; on this account they do not improve the lower-layer knowledge encoding that knowledge-intensive tasks depend on. Adding reasoning training to a model that lacks medical knowledge won't close the knowledge gap, because it modifies layers that are not the bottleneck.
Architectural evidence for layer redundancy: "The Unreasonable Ineffectiveness of the Deeper Layers" (2403.17887) provides striking corroboration. Up to half of an LLM's layers can be pruned with minimal degradation on question-answering benchmarks, using a simple strategy: identify the optimal block of layers to prune by cross-layer similarity, then heal the damage with QLoRA finetuning on a single A100 GPU. This implies either that current pretraining methods are not properly leveraging the parameters in deeper layers, or that shallow layers play a disproportionately critical role in storing knowledge. Both interpretations reinforce the functional separation: if knowledge resides in lower layers, the deeper layers' contribution may be primarily redundant refinement rather than essential computation.
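The block-selection step can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code: it uses a single representation vector per layer boundary (in practice the distance would be averaged over many inputs), measures similarity as normalized angular distance, and omits the QLoRA healing step entirely. The function names and the toy vectors are hypothetical.

```python
import math


def angular_distance(u, v):
    """Normalized angular distance in [0, 1] between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # clamp rounding error
    return math.acos(cosine) / math.pi


def best_block_to_prune(hidden_states, n):
    """Find the start of the n-layer block whose removal changes the
    representation least.

    hidden_states[l] is the representation entering layer l; the final
    entry is the output of the last layer. Removing layers
    start..start+n-1 connects hidden_states[start] directly to what
    used to receive hidden_states[start + n].
    """
    best_start, best_dist = None, float("inf")
    for start in range(len(hidden_states) - n):
        dist = angular_distance(hidden_states[start], hidden_states[start + n])
        if dist < best_dist:
            best_start, best_dist = start, dist
    return best_start, best_dist


# Toy 4-layer model (5 layer boundaries), 2-dimensional representations.
toy_states = [[1.0, 0.0], [0.0, 1.0], [0.1, 1.0], [0.12, 1.0], [1.0, 1.0]]
start, dist = best_block_to_prune(toy_states, n=2)
print(f"prune layers {start}..{start + 1} (angular distance {dist:.3f})")
```

In the toy data, the representations entering layers 1 and 3 are nearly parallel, so the block starting at layer 1 is selected as the cheapest to remove.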
Retrieval heads as mechanistic evidence: The "Retrieval Head" paper provides direct causal evidence for layer specialization. A sparse set of attention heads (<5%) is responsible for retrieving relevant information from long context. These retrieval heads are: (1) universal across model families, (2) intrinsic: they exist in short-context models and persist through context-length extension, (3) dynamically activated: some always attend to required information while others activate contextually, and (4) causal: pruning them causes hallucination while pruning non-retrieval heads has no effect. Retrieval heads strongly influence CoT reasoning (which requires referring back to prior context) but minimally affect tasks where the model generates from intrinsic knowledge. This is a specific mechanistic instantiation of the lower-layer knowledge retrieval function described above. See What mechanism enables models to retrieve from long context?.
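One way to picture how such heads might be identified is a copy-attention score: how often a head's most-attended position lies inside the target span and holds the very token being generated. The sketch below is a simplified proxy written for illustration, not the paper's exact definition. The record format, the function name, the 0.5 threshold, and the toy data are all assumptions.

```python
def copy_attention_score(steps, needle_positions):
    """Fraction of decoding steps in which the head 'copies' from the needle.

    Each step is (generated_token, top_attended_position, token_at_position).
    A step counts as a hit when the head's top-attended position is inside
    the needle span and the token there equals the token being generated.
    """
    if not steps:
        return 0.0
    hits = sum(
        1
        for generated, position, attended in steps
        if position in needle_positions and attended == generated
    )
    return hits / len(steps)


# Hypothetical records: the needle occupies context positions 10-12.
needle = {10, 11, 12}
head_a = [("Paris", 10, "Paris"), ("is", 11, "is"), ("sunny", 12, "sunny")]
head_b = [("Paris", 0, "<s>"), ("is", 3, "the"), ("sunny", 12, "sunny")]

scores = {
    "head_a": copy_attention_score(head_a, needle),
    "head_b": copy_attention_score(head_b, needle),
}
THRESHOLD = 0.5  # assumed cut-off, chosen only for this example
retrieval_like = [name for name, score in scores.items() if score >= THRESHOLD]
print(scores, retrieval_like)
```

Under this proxy, a head that tracks the needle at every step scores 1.0, while a head that mostly attends elsewhere falls below the cut-off. The causal claim in the text would then be tested by masking the high-scoring heads and checking whether retrieval accuracy collapses.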
Latent concept hierarchy: "Discovering Latent Concepts Learned in BERT" (2205.07237) supports the layer hierarchy from a representation perspective. Lower layers dominate in learning shallow lexical concepts, while higher layers learn semantic relations. Critically, BERT learns novel concepts (e.g., animal categories, demographic groups) that do not adhere to predefined categorizations; the model discovers its own organizational structure. Several latent concepts are based on multiple properties spanning semantics, syntax, and morphology simultaneously, suggesting that the layer separation is not clean but follows a general gradient.
The "Procedural Knowledge in Pretraining Drives Reasoning" paper provides a data-level explanation that complements this architectural finding. By ranking 5 million pretraining documents by their influence on model completions, the authors show that reasoning draws on a diffuse set of documents containing procedural knowledge (descriptions of how to solve a problem), while factual recall draws on narrow document sets containing the target fact. This maps directly onto the layer separation: lower layers store memorized facts (requiring document-specific exposure), while higher layers encode procedural strategies (learnable from general demonstrations of method). See Does procedural knowledge drive reasoning more than factual retrieval?.
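The diffuse-versus-narrow contrast can be made concrete with a simple concentration measure over per-document influence scores. The sketch below is only an illustration of that idea and does not reproduce the paper's influence computation. The function name and all scores are hypothetical.

```python
def top_k_share(influences, k):
    """Share of total absolute influence carried by the k most influential documents."""
    magnitudes = sorted((abs(x) for x in influences), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(magnitudes[:k]) / total


# Hypothetical influence scores for seven documents per query.
factual_query = [9.0, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]    # one document dominates
reasoning_query = [1.2, 1.1, 1.0, 1.0, 0.9, 0.9, 0.9]  # influence is spread out

factual_share = top_k_share(factual_query, k=1)
reasoning_share = top_k_share(reasoning_query, k=1)
print(f"factual top-1 share:   {factual_share:.2f}")
print(f"reasoning top-1 share: {reasoning_share:.2f}")
```

A high top-k share corresponds to the narrow document sets described for factual recall; a low share corresponds to the diffuse procedural support described for reasoning.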
Inquiring lines that use this note as a source (57)
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- Do integrated and decoupled architectures trade off intervention accuracy for efficiency differently?
- How do knowledge layers differ functionally from reasoning layers in networks?
- Why does content richness matter more than linguistic style in patient simulation?
- What makes reasoning capability a pre-training rather than post-training phenomenon?
- Can causal models be extended to include non-causal cognition?
- How does cognitive fit theory explain why different tasks need different knowledge structures?
- What circuit mechanisms produce belief bias in syllogistic reasoning?
- Do explicit reasoning chains improve or harm performance on complex judgment tasks?
- How much does pre-training frequency predict reasoning task performance?
- Why does training format shape reasoning strategy more than domain?
- Why does general reasoning not transfer to knowledge-intensive medical domains?
- Does explicit reasoning help or hurt tasks requiring continuous nuanced judgment?
- Does domain training degrade reasoning ability even when benchmark scores rise?
- Can reasoning skills trained on law improve performance in STEM?
- How does cross-domain reasoning transfer differ from domain-specific knowledge transfer?
- What makes symbolic operations different from general knowledge questions?
- Why do medical and mathematical tasks require fundamentally different model capabilities?
- Are detection and identification of injections truly separable in neural circuits?
- What is the difference between procedural knowledge and factual retrieval in reasoning?
- What neuroscience evidence suggests language networks are not optimized for reasoning?
- What makes knowledge-rich specialized domains structurally different from general reasoning tasks?
- Are traditional cognitive theories missing interaction effects between mechanisms?
- Can high test performance mask a complete absence of understanding?
- Can intrinsic reward signals extend beyond mathematics to medicine and law?
- What information do numerical rewards fail to provide for reasoning tasks?
- How much does training composition affect syntactic versus reasoning performance?
- How does computational split-brain syndrome differ from ordinary knowledge gaps?
- How much does training data presentation format shape reasoning ability?
- Can safety training and reasoning training be combined without losing calibration?
- Does formal reasoning training actively degrade social reasoning ability?
- What separates knowledge from reasoning in neural network layers?
- What role does curriculum design play in reasoning emergence?
- How do reasoning training methods sacrifice some thinking skills while improving others?
- Can a single correct example seed exponential improvement in mathematical reasoning?
- Does explicit reasoning help or hurt tasks requiring continuous judgment?
- What makes social reasoning fundamentally different from mathematical reasoning?
- How do knowledge and reasoning circuits interfere in the same neural network?
- How should safety training and reasoning training balance abstention differently?
- Which application domains like healthcare and education lack alignment research?
- What makes reasoning auditable in medical AI decision support?
- Can reasoning scaffolds help with nuanced judgment tasks like empathy?
- Why might social reasoning work differently than formal logical reasoning?
- Does reasoning training actively undermine the abstention capacity safety training created?
- Why does reasoning training improve math but hurt knowledge tasks?
- Why does contextual judgment matter more in law and medicine than in mathematics?
- Do interaction effects between research mechanisms depend on the task domain?
- What is the distinction between teaching reasoning how versus when to activate?
- Can pretraining signals unlock latent reasoning that post-training merely activates?
- What distinguishes reasoning activation mechanisms across different training methods?
- Does training data format shape reasoning strategy more than domain content?
- How does the knowing-doing gap relate to Potemkin understanding?
- Can mathematical reasoning improvements transfer across problem subdomains?
- Why does reasoning transfer across different numbers but factual recall does not?
- Why do higher network layers capture procedural knowledge but lower layers store facts?
- Why do knowledge and reasoning train in different network layers?
- Is reasoning failure caused by task complexity or training distribution gaps?
- Can articulating latent reasoning processes improve transfer across domains?
Related concepts in this collection (4)
- Does medical AI need knowledge or reasoning more?
  Medical and mathematical domains may require fundamentally different AI training priorities. If medical accuracy depends primarily on factual knowledge while math depends on reasoning quality, should we build and evaluate these systems differently?
  layer localization is the mechanistic explanation for the behavioral pattern this note documents
- Why doesn't mathematical reasoning transfer to medicine?
  Can models trained to reason well about math apply those skills to medical domains through fine-tuning? This explores whether reasoning ability is truly domain-agnostic or constrained by domain-specific knowledge requirements.
  transfer fails because SFT modifies higher-layer reasoning while the bottleneck is lower-layer knowledge; this paper makes that precise
- Do language models actually use their encoded knowledge?
  Probes can detect that LMs encode facts internally, but do those encoded facts causally influence what the model generates? This explores the gap between knowing and doing.
  layer localization explains the encoding-generation gap: knowledge in lower layers may be overridden by higher-layer reasoning adjustments that introduce error, producing the failure mode where the model "knows" the answer but generates an incorrect one
- Can text-trained models compress images better than specialized tools?
  Do general-purpose language models trained only on text outperform domain-specific compressors like PNG and FLAC on their native data? This tests whether compression ability is universal or requires domain specialization.
  the compression framing maps onto the layer separation: lower layers compress facts (document-specific memorization), higher layers compress procedures (generalizable reasoning); the scaling caveat on adjusted compression may reflect redundancy in deeper layers
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Decoupling Knowledge and Reasoning in LLMs: An Exploration Using Cognitive Dual-System Theory
- Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains
- The Incomplete Bridge: How AI Research (Mis)Engages with Psychology
- Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference
- LLMs can implicitly learn from mistakes in-context
- Eliciting Reasoning in Language Models with Cognitive Tools
- Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
- Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications
Original note title
knowledge resides in lower network layers and reasoning in higher layers — this functional separation explains why reasoning training helps math but can impair knowledge-intensive domains