INQUIRING LINE

What computational role do intermediate tokens actually play in transformers?

This explores whether the visible reasoning tokens a transformer generates — the chains-of-thought, the 'thinking out loud' — are actually where the computation happens, or whether they're doing something stranger.


This explores whether the intermediate tokens a transformer writes out — its reasoning chains, filler, scratch work — are the actual site of computation, or just a surface that the real work hides behind. The corpus points to an unsettling answer: the tokens you can read are often not where the thinking lives, and not all of them are pulling weight.

The most direct evidence that visible tokens can be decoupled from reasoning comes from logit-lens work on models trained with hidden chain-of-thought tokens: they compute the correct answer in their earliest layers (layers 1-3) and then actively suppress it in the final layers to emit format-compliant filler (see: Do transformers hide reasoning before producing filler tokens?). The reasoning is real but buried, recoverable from lower-ranked predictions, while the token actually printed is a kind of cover story. This dovetails with the finding that you can scale a model's reasoning entirely in latent space, iterating on hidden states without ever verbalizing a step (see: Can models reason without generating visible thinking tokens?). Put together, they suggest verbalization is a training artifact, not a computational requirement: the tokens are downstream of the work, not the work itself.
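To make "recoverable from lower-ranked predictions" concrete, here is a minimal sketch of the logit-lens idea: project each layer's hidden state onto the vocabulary and check where the answer token ranks. Everything in it is an assumption made for illustration: the 3-token vocabulary, the 2-dimensional hidden states, and the names `lens_report`, `UNEMBED`, and `HIDDEN_BY_LAYER` are invented, and a real analysis would read activations from an actual model and usually apply the final layer norm before unembedding.

```python
def logits_from_hidden(hidden, unembed):
    """Project one hidden-state vector onto the vocabulary (the logit-lens step)."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in unembed]


def rank_of(token_id, logits):
    """Rank of a token in the prediction: 0 means top, higher means further down."""
    target = logits[token_id]
    return sum(1 for value in logits if value > target)


def lens_report(hidden_by_layer, unembed, answer_id):
    """For each layer, return (top_token_id, rank_of_answer_token)."""
    report = []
    for hidden in hidden_by_layer:
        logits = logits_from_hidden(hidden, unembed)
        top = max(range(len(logits)), key=logits.__getitem__)
        report.append((top, rank_of(answer_id, logits)))
    return report


# Toy data, invented for illustration only.
# Token 0 = filler, token 1 = the correct answer, token 2 = a distractor.
UNEMBED = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
HIDDEN_BY_LAYER = [
    [0.1, 0.9],  # early layer: the answer direction dominates
    [0.5, 0.6],  # middle layer: answer still on top, filler rising
    [1.0, 0.4],  # final layer: filler wins, answer drops to rank 1
]

report = lens_report(HIDDEN_BY_LAYER, UNEMBED, answer_id=1)
for layer, (top, answer_rank) in enumerate(report):
    print(f"layer {layer}: top token = {top}, answer rank = {answer_rank}")
```

In this toy setup the answer is the top prediction in the first two layers and falls to second place in the last one, which is the pattern the paragraph describes: the printed token changes, but the answer is still sitting just below it in the ranking.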

Then there's the question of whether the tokens even need to *mean* anything. Models trained on deliberately corrupted, irrelevant reasoning traces perform comparably to those trained on correct ones (see: Do reasoning traces need to be semantically correct?), which implies the trace functions as computational scaffolding (a place to spend compute, hold state, extend the forward pass) rather than as a chain of meaningful inferences. But 'scaffolding' isn't uniform. When you prune reasoning chains by functional importance, distinct categories emerge: symbolic-computation tokens get preferentially preserved while grammar and meta-discourse get dropped first (see: Which tokens in reasoning chains actually matter most?). And in reinforcement learning, only about 20% of tokens, the high-entropy 'forking points' where the model faces a real branch, carry most of the learning signal; training on just those matches or exceeds full updates (see: Do high-entropy tokens drive reasoning model improvements?). So a minority of tokens are genuine decision pivots, and the rest are connective tissue.
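The 'forking point' idea can be sketched as an entropy filter: compute the entropy of the model's next-token distribution at each position, keep the top 20% of positions, and average the loss over those alone. This is a minimal sketch under stated assumptions, not the cited work's implementation: the function names `forking_mask` and `masked_loss`, the toy distributions, and the per-sequence (rather than per-batch) selection are all choices made here for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def forking_mask(distributions, keep_fraction=0.2):
    """Return a 0/1 mask that keeps only the highest-entropy positions."""
    entropies = [entropy(p) for p in distributions]
    keep = max(1, round(keep_fraction * len(entropies)))
    ranked = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    chosen = set(ranked[:keep])
    return [1 if i in chosen else 0 for i in range(len(entropies))]


def masked_loss(per_token_losses, mask):
    """Average the loss over the kept positions only."""
    kept = [loss for loss, m in zip(per_token_losses, mask) if m]
    return sum(kept) / len(kept)


# Toy next-token distributions for five positions (invented numbers).
DISTRIBUTIONS = [
    [0.97, 0.02, 0.01],    # near-certain continuation
    [0.40, 0.35, 0.25],    # a genuine branch: several plausible next tokens
    [0.90, 0.05, 0.05],
    [0.99, 0.005, 0.005],
    [0.80, 0.10, 0.10],
]
LOSSES = [0.1, 1.2, 0.3, 0.05, 0.4]  # hypothetical per-token losses

mask = forking_mask(DISTRIBUTIONS, keep_fraction=0.2)
loss = masked_loss(LOSSES, mask)
print(mask, loss)
```

With five positions and a 20% budget, only the one position where the distribution is spread across several candidates survives the mask, so the update is driven entirely by that branch.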

Laterally, this connects to a deeper view of what tokens are doing at all. One framing treats the transformer's residual stream as continuous *flow* rather than storage: knowledge exists only in the act of generation, more like oral performance than retrieval from an archive (see: Do transformer models store knowledge or generate it continuously?). Under that lens, intermediate tokens are checkpoints in an ongoing computation, not records of stored thought. And at the theoretical ceiling, a single finite-size transformer can in principle compute any computable function given the right prompt, with intermediate tokens acting as the program's working tape (see: Can a single transformer become universally programmable through prompts?), though standard training rarely produces models that actually use them that way.

The thing you might not have known you wanted to know: the model's visible reasoning is closer to a *workspace* than a *transcript*. Some tokens are load-bearing decision points, many are scaffolding that need not be true, and the answer itself may already exist in the hidden layers before a single 'reasoning' word is written. The chain-of-thought you read is partly genuine computation, partly theater the model performs because we trained it to.


Sources (7 notes)

Do transformers hide reasoning before producing filler tokens?

Logit lens analysis shows models trained with hidden CoT tokens compute correct answers in layers 1-3, then actively suppress these representations in final layers to produce format-compliant filler output. The reasoning is fully recoverable from lower-ranked token predictions.

Can models reason without generating visible thinking tokens?

Multiple architectures—depth-recurrent models, Heima, and Coconut—demonstrate that test-time compute scales through hidden state iteration rather than token generation. This suggests verbalization is a training artifact, not a reasoning requirement.
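As a loose analogy for scaling compute through hidden-state iteration, the sketch below applies one shared update to a latent state a variable number of times and reads out an answer only once at the end. It is a toy under stated assumptions: there are no learned weights, the shared block is just a Newton step for a square root, and `recurrent_block` and `latent_reason` are hypothetical names rather than anything from the architectures named above. It only illustrates the shape of the idea, namely that extra iterations improve the result without any intermediate output being emitted.

```python
def recurrent_block(state, x):
    """One shared update, reused at every iteration (here: a Newton step toward sqrt(x))."""
    return 0.5 * (state + x / state)


def latent_reason(x, num_iterations):
    """Iterate on the latent state; nothing is emitted until the final read-out."""
    state = 1.0  # initial latent state
    for _ in range(num_iterations):
        state = recurrent_block(state, x)  # silent step: no token is produced
    return state  # single decode at the end


# More iterations means more test-time compute and a smaller final error.
errors = [abs(latent_reason(2.0, k) - 2.0 ** 0.5) for k in (1, 2, 4)]
for k, err in zip((1, 2, 4), errors):
    print(f"{k} iterations: error = {err:.2e}")
```

The compute budget here is the iteration count alone, which is the knob that latent-reasoning approaches turn in place of generating more visible tokens.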

Do reasoning traces need to be semantically correct?

Models trained on systematically irrelevant traces maintain solution accuracy and sometimes improve out-of-distribution generalization, suggesting traces function as computational scaffolding rather than meaningful reasoning steps.

Which tokens in reasoning chains actually matter most?

Greedy likelihood-preserving pruning reveals six functional token categories; symbolic computation tokens are preferentially preserved while grammar and meta-discourse are pruned first. Student models trained on these pruned chains outperform those trained on frontier-model compression.
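The greedy pruning loop can be sketched as follows: at each step, drop the token whose removal costs the least according to a scoring function, and stop at a target length. This is a minimal sketch, not the cited method's code. In particular, `toy_score` and its weight table are stand-ins invented here; the real criterion would be a model's likelihood of the answer given the remaining chain.

```python
def greedy_prune(tokens, score, keep):
    """Repeatedly drop the token whose removal leaves the highest score."""
    kept = list(tokens)
    removed = []
    while len(kept) > keep:
        best_index = max(
            range(len(kept)),
            key=lambda i: score(kept[:i] + kept[i + 1:]),
        )
        removed.append(kept.pop(best_index))
    return kept, removed


# Hypothetical per-token contributions, standing in for a model's likelihood.
TOY_WEIGHT = {
    "so": 0.05, "we": 0.1, "get": 0.2,            # grammar and meta-discourse
    "3": 2.0, "*": 1.5, "4": 2.0, "=": 1.0, "12": 3.0,  # symbolic computation
}


def toy_score(tokens):
    """Stand-in scorer: sum of the invented weights of the remaining tokens."""
    return sum(TOY_WEIGHT.get(token, 0.0) for token in tokens)


chain = ["so", "we", "get", "3", "*", "4", "=", "12"]
kept, removed = greedy_prune(chain, toy_score, keep=5)
print("kept:   ", kept)
print("removed:", removed)
```

Under these invented weights the connective words go first and the arithmetic survives, mirroring the reported ordering. Note that each pruning step rescoring every candidate removal is quadratic in chain length, so a real run against a model would be expensive.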

Do high-entropy tokens drive reasoning model improvements?

Only ~20% of tokens exhibit high entropy as pivotal reasoning decision points; RLVR primarily adjusts these forking tokens. Training exclusively on them matches or exceeds full-gradient performance, revealing that the minority carries the learning signal.

Do transformer models store knowledge or generate it continuously?

Transformers organize knowledge as flowing activations rather than retrievable archives, mirroring oral cultures where knowledge exists only in performance. This explains why model knowledge is contextual, difficult to edit, and inseparable from generation.

Can a single transformer become universally programmable through prompts?

Research proves a single finite-size transformer exists that can compute any computable function given the right prompt, achieving complexity bounds nearly matching unbounded models. However, standard training rarely produces models that learn to implement arbitrary programs this way.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a researcher re-testing claims about whether intermediate tokens in transformers are genuine computation sites or scaffolding. The question remains open: what computational role do the tokens a model writes—its reasoning chains, filler, working steps—actually play?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026; treat these as perishable constraints to re-examine:
• Transformers compute correct answers in early layers, then overwrite them with format-compliant output; reasoning is recoverable from hidden states, not the printed token (~2024–2025).
• Latent reasoning scales test-time compute without verbalization; verbalization may be a training artifact, not computational necessity (~2025).
• Models trained on deliberately corrupted reasoning traces perform comparably to correct ones, suggesting tokens function as scaffolding rather than meaningful inference (~2025).
• ~20% of tokens are high-entropy 'forking points' carrying the learning signal in RL; training on just those matches full updates (~2025).
• Transformer residual streams transmit knowledge as continuous flow, not stored state; tokens are checkpoints in ongoing computation (~2024).

Anchor papers (verify; mind their dates):
• arXiv:2412.04537 (2024-12) — Understanding Hidden Computations in Chain-of-Thought Reasoning
• arXiv:2502.05171 (2025-02) — Scaling up Test-Time Compute with Latent Reasoning
• arXiv:2505.13775 (2025-05) — Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens
• arXiv:2506.01939 (2025-06) — Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective RL

Your task:
(1) RE-TEST EACH CONSTRAINT. For the five findings above, determine whether post-2026 models, improved mechanistic interpretability methods, new training regimes (e.g., process reward models, constitutional AI), or evaluations have relaxed or overturned these claims. Separate the durable question (e.g., 'Are all tokens equally important?') from the perishable limitation (e.g., 'Older models hide reasoning in layers 1-3'). Cite what relaxed or resolved each constraint.
(2) SURFACE THE STRONGEST DISAGREEMENT: identify the most recent work (last 6 months) that contradicts the 'scaffolding' framing or restores semantic meaning to intermediate tokens. Where does the literature diverge?
(3) PROPOSE two new research questions that assume the regime may have shifted — e.g., whether emergent reasoning-in-latent-space breaks down at scale, or whether newer pruning techniques identify a *different* token ranking than entropy-based importance.

Cite arXiv IDs; flag anything you cannot ground in a real paper.
