How do thought anchors differ from individual forking tokens mechanistically?
This explores the difference between two units of analysis for pinpointing 'what matters' inside a reasoning chain: thought anchors, the sentence-level steps that organize and steer everything downstream of them, versus forking tokens, the single tokens at branch points that tip the chain onto one path or another. The collection doesn't carry the specific papers that coined either term, so it speaks to the territory more than to those exact terms. It does have a surprising amount on the underlying mechanics, though, and the cleanest way to see the distinction is through the scale of influence each one exerts.
The token-level view is well represented. One thread finds that specific tokens, words like 'Wait' and 'Therefore', are sharp peaks of mutual information with the correct answer: suppress them and reasoning degrades, but suppress an equal number of random tokens and reasoning is unaffected (Do reflection tokens carry more information about correct answers?). That's the forking-token intuition made concrete: influence is concentrated in a sparse handful of tokens that act as switches. A complementary study shows models internally rank tokens by functional importance, preferentially preserving symbolic-computation tokens while pruning grammar and meta-discourse first (Which tokens in reasoning chains actually matter most?). Both treat the token as the atom of causal weight.
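To make the 'sharp peak of mutual information' idea concrete, here is a minimal sketch that estimates mutual information between a binary "chain contains this token" flag and answer correctness. Everything in it is an assumption for illustration: the cited note does not describe its estimator here (it may well work on hidden representations rather than token presence), and the sample counts are invented toy data, not results.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of mutual information (in bits) from (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical toy data: (chain contains the token?, final answer correct?)
# A switch-like token co-occurs strongly with correct answers...
wait_samples = [(1, 1)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 40
# ...while a filler token is statistically independent of correctness.
filler_samples = [(1, 1)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25 + [(0, 0)] * 25

mi_wait = mutual_information(wait_samples)
mi_filler = mutual_information(filler_samples)
print(f"switch-like token: {mi_wait:.3f} bits, filler token: {mi_filler:.3f} bits")
```

On this toy data the switch-like token carries a clearly positive amount of information about correctness while the filler token carries none, which is the shape of contrast the suppression experiment is probing.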
An 'anchor,' by contrast, is about reach rather than position: a step whose effect propagates across many later steps. The corpus gets at this through error and dependency structure. Memorization-source analysis finds that 'local' memorization, keyed to the immediately preceding tokens, drives up to 67% of reasoning errors, which suggests that much of a chain is locally chained rather than globally planned (Where do memorization errors arise in chain-of-thought reasoning?). And the decomposition of CoT into output probability, memorization, and genuine but error-accumulating reasoning shows that error accumulates with each step, so influence compounds down the chain (What three separate factors drive chain-of-thought performance?). An anchor is a step early in that compounding cascade: its downstream footprint is large because everything after it inherits its framing.
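One way to picture 'reach rather than position' is to count how many later steps depend on a given step, directly or transitively. The sketch below does that for a hand-written dependency map. The map itself, the `downstream_footprint` name, and the idea of measuring reach this way are assumptions made for illustration; this is not the method of any cited note.

```python
def downstream_footprint(depends_on: dict[int, list[int]]) -> dict[int, int]:
    """For each step, count the steps that depend on it directly or transitively."""
    # Invert the "depends on" edges into "is built on by" edges.
    children: dict[int, list[int]] = {step: [] for step in depends_on}
    for step, parents in depends_on.items():
        for parent in parents:
            children[parent].append(step)

    footprint = {}
    for step in depends_on:
        seen: set[int] = set()
        stack = list(children[step])
        while stack:  # depth-first walk over everything built on this step
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(children[node])
        footprint[step] = len(seen)
    return footprint

# Hypothetical six-step chain: step -> steps it builds on.
chain = {0: [], 1: [0], 2: [1], 3: [1], 4: [3], 5: [4]}
footprint = downstream_footprint(chain)
print(footprint)
```

In this toy chain, steps 0 and 1 have the largest footprints because every later step inherits from them, while step 2 is a dead end with no footprint despite sitting mid-chain.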
So the mechanistic split is really about scope of causal influence: a forking token is a local, high-leverage switch whose effect is sharp and immediate, while a thought anchor is a step whose effect is broad and cumulative because the rest of the chain is built on top of it. There's a deeper unease underneath both, though. Faithfulness work shows that after fine-tuning, reasoning steps less reliably influence the final answer at all: early termination, paraphrasing, or filler substitution more often leaves answers unchanged, so the 'causal weight' we attribute to any token or step can be partly performative (Does fine-tuning disconnect reasoning steps from final answers?). And the broader finding that CoT is pattern-guided imitation of reasoning form, not formal logic, suggests anchors and forks may be features of a learned format rather than load-bearing logical joints (What makes chain-of-thought reasoning actually work?; Does chain-of-thought reasoning reveal genuine inference or pattern matching?).
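The early-termination test can be sketched as a simple invariance check: cut the chain off at every prefix length and ask how often the answer matches the full-chain answer. The two 'models' below are hypothetical stand-in functions, not real LLM calls, and the scoring rule is one plausible reading of the test rather than the cited work's exact protocol.

```python
from typing import Callable, Sequence

AnswerFn = Callable[[str, Sequence[str]], int]

def early_termination_invariance(answer_fn: AnswerFn, question: str,
                                 steps: Sequence[str]) -> float:
    """Fraction of truncated chains whose answer equals the full-chain answer."""
    if not steps:
        raise ValueError("need at least one reasoning step to truncate")
    full_answer = answer_fn(question, steps)
    # Truncate before step k for k = 0 .. len(steps) - 1 (k = 0 means no reasoning).
    truncated = [answer_fn(question, steps[:k]) for k in range(len(steps))]
    return sum(a == full_answer for a in truncated) / len(truncated)

def functional_model(question: str, steps: Sequence[str]) -> int:
    """Toy model whose answer is computed from the steps it was given."""
    return sum(int(step.split()[1]) for step in steps)

def performative_model(question: str, steps: Sequence[str]) -> int:
    """Toy model that answers from the question alone and ignores the steps."""
    return sum(int(tok) for tok in question.split())

question = "2 3 5"
steps = ["add 2", "add 3", "add 5"]
functional_rate = early_termination_invariance(functional_model, question, steps)
performative_rate = early_termination_invariance(performative_model, question, steps)
print(functional_rate, performative_rate)
```

A rate near 1.0 means the chain can be cut anywhere without changing the answer, which is the performative pattern; a rate near 0.0 means the answer really is built from the steps.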
If you want the thing the corpus quietly reveals: 'which part of the reasoning matters' has no single answer because it's asked at two scales at once, the token switch and the sentence anchor, and the field hasn't fully reconciled whether either is steering the model or merely narrating a decision already made elsewhere. The latent-reasoning work pushes that further, showing models can scale reasoning entirely in hidden state with no verbalized tokens to anchor or fork at all (Can models reason without generating visible thinking tokens?).
Sources (8 notes)
Specific tokens like "Wait" and "Therefore" show sharp spikes in mutual information with correct answers. Suppressing them harms reasoning while suppressing equal random tokens does not, and representation recycling improves accuracy 20%.
Greedy likelihood-preserving pruning reveals six functional token categories; symbolic computation tokens are preferentially preserved while grammar and meta-discourse are pruned first. Student models trained on these pruned chains outperform those trained on frontier-model compression.
STIM framework identifies local, mid-range, and long-range memorization sources in CoT reasoning. Local memorization—based on preceding tokens—accounts for up to 67% of reasoning errors, especially as complexity increases and distributional shift occurs.
A shift cipher study decomposed CoT into three independent factors: output probability alone swings accuracy from 26% to 70%, memorization matches pre-training frequency patterns, and genuine reasoning exists but accumulates error with each step. This resolves the reason-or-memorize debate by showing LLMs do both simultaneously.
Three faithfulness tests show fine-tuned models generate reasoning chains that less reliably influence final outputs. Early termination, paraphrasing, and filler substitution all produce invariant answers more often after fine-tuning, suggesting reasoning becomes performative rather than functional.
Research shows training format shapes reasoning strategy 7.5× more than domain, demo position swings accuracy 20%, and invalid CoT prompts work as well as valid ones. CoT is pattern-guided generation, not formal logic.
CoT works by constraining models to reproduce familiar reasoning patterns from training, not by enabling novel symbolic reasoning. Performance degrades predictably under distribution shifts—the signature of imitation rather than capability emergence.
Multiple architectures—depth-recurrent models, Heima, and Coconut—demonstrate that test-time compute scales through hidden state iteration rather than token generation. This suggests verbalization is a training artifact, not a reasoning requirement.