How much does input format shape what reasoning strategy a model develops?
This explores how much the *shape* of what you feed a model — multiple-choice vs. free-form, dialogue vs. monologue, visible vs. hidden steps — determines the kind of reasoning it learns to do, separate from the actual subject matter.
The corpus has a surprisingly blunt answer: format matters far more than most people assume, and in at least one measurement it dwarfs content entirely. The headline result is that training data *format* shapes reasoning strategy roughly 7.5 times more than the *domain* of the data (Does training data format shape reasoning strategy more than domain?). Models trained on multiple-choice data learn to scan broadly across options (breadth-first), while free-form training pushes them to drill down a single line of thought (depth-first). The presentation, in other words, teaches the habit.
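The source note for this result reports the format effect as Cohen's d (up to 1.5). As a reminder of what that statistic measures, here is a minimal sketch of Cohen's d with a pooled standard deviation. The scores are made-up illustrative numbers, not data from the study, and the idea of a per-response "breadth score" is a hypothetical stand-in for whatever metric the original work used.

```python
import statistics
from math import sqrt

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical breadth scores per response (illustrative only).
multiple_choice_trained = [4, 5, 6, 5, 5]
free_form_trained = [3, 4, 5, 4, 4]

d = cohens_d(multiple_choice_trained, free_form_trained)
print(round(d, 2))  # 1.41
```

By the usual convention a d above 0.8 already counts as a large effect, which gives a sense of scale for the reported value of 1.5.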
This lands harder when you see how shallow the reasoning underneath can be. One line of work shows that chain-of-thought is mostly pattern-guided generation rather than formal logic: demo position alone can swing accuracy by 20%, and even logically *invalid* CoT prompts work about as well as valid ones (What makes chain-of-thought reasoning actually work?). If the structure of the prompt does the heavy lifting, then format isn't a cosmetic wrapper around reasoning; it's a large part of the reasoning itself. That reframes a lot: when a small 1.5B model with only LoRA format-tuning matches much larger RL-trained models, the implication is that RL was largely teaching *output organization*, not new knowledge (Can small models reason well by just learning output format?). Reasoning skill and stored knowledge turn out to be surprisingly separable.
The lateral surprise is that you can change reasoning *strategy* just by changing the conversational shape of the model's own output. DialogueReason restructures a single model's internal thinking as a back-and-forth between distinct agents, and that format change alone produces more diverse, less fragmented reasoning than the usual single-voice monologue, especially on problems that need several different approaches (Can dialogue format help models reason more diversely?). So format doesn't just set breadth-vs-depth at training time; it can unlock or suppress whole strategies at inference time. Verbosity itself is also a steerable knob: concise and verbose chains of thought live in distinct regions of activation space, so you can dial reasoning length up or down with a single vector and no retraining (Can we steer reasoning toward brevity without retraining?).
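To make the "single vector" idea concrete, here is a minimal sketch of one common way to build a steering vector: take the difference between mean activations on concise and verbose examples, then add it to a hidden state at inference time. This is an assumption for illustration and not necessarily how the cited method computes its vector. The activations are random toy arrays rather than real model states, and `steering_vector`, `steer`, and `strength` are hypothetical names.

```python
import numpy as np

def steering_vector(concise_acts, verbose_acts):
    """Difference of mean activations; points from the verbose region toward the concise one."""
    return concise_acts.mean(axis=0) - verbose_acts.mean(axis=0)

def steer(hidden, vector, strength=1.0):
    """Shift a hidden state along the steering direction; no weights are changed."""
    return hidden + strength * vector

rng = np.random.default_rng(0)
d_model, n_pairs = 16, 50  # 50 paired examples, matching the count in the source note

# Toy setup (assumption): concise activations equal the verbose ones shifted
# along dimension 0, plus a little noise.
offset = np.zeros(d_model)
offset[0] = 2.0
verbose = rng.normal(size=(n_pairs, d_model))
concise = verbose + offset + 0.1 * rng.normal(size=(n_pairs, d_model))

v = steering_vector(concise, verbose)
hidden = rng.normal(size=d_model)
steered = steer(hidden, v, strength=1.0)

print(int(np.argmax(np.abs(v))))  # index of the dimension that carries the shift
```

In this toy setup a negative `strength` pushes the state back toward the verbose region, which is the sense in which length becomes a dial rather than a retraining target.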
But here's the twist that should make you skeptical of taking format at face value: the visible format may not be where the reasoning actually happens. Transformers trained to hide their chain-of-thought compute the correct answer in the *first few layers*, then actively overwrite those representations to emit format-compliant filler tokens, so the real reasoning is recoverable underneath the surface output (Do transformers hide reasoning before producing filler tokens?). Related work shows models can scale test-time reasoning entirely in latent space, with no verbalized steps at all, suggesting that writing out your thinking is a training artifact rather than a requirement (Can models reason without generating visible thinking tokens?). So format powerfully shapes what reasoning *looks like*, and shapes the strategy a model adopts, but it isn't a transparent window onto what the model is actually computing.
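The kind of analysis behind the hidden-reasoning claim can be sketched with a logit lens: project each layer's hidden state through the unembedding matrix and check where the answer token ranks. The hidden states below are hand-built toy values chosen to mimic the reported pattern, not outputs of a real model. A real logit lens would typically also apply the model's final normalization before unembedding, which is omitted here. The names `logit_lens` and `token_rank` and the token ids are hypothetical.

```python
import numpy as np

def logit_lens(hidden_by_layer, unembed):
    """Project every layer's hidden state to vocabulary logits: (layers, vocab)."""
    return hidden_by_layer @ unembed

def token_rank(logits, token_id):
    """Rank of token_id among the logits; 0 means it is the top prediction."""
    order = np.argsort(-logits)
    return int(np.where(order == token_id)[0][0])

ANSWER, FILLER = 3, 0   # hypothetical token ids
unembed = np.eye(5)     # toy unembedding: hidden dimension i maps to token i

# Toy hidden states at one position, one row per layer (early -> final).
hidden_by_layer = np.array([
    [0.0, 0.0, 0.0, 3.0, 0.0],  # early layer: answer dominates
    [0.5, 0.0, 0.0, 2.0, 0.0],  # middle layer: filler starts to rise
    [3.0, 0.0, 0.0, 1.0, 0.0],  # final layer: filler on top, answer just below
])

logits = logit_lens(hidden_by_layer, unembed)
top_tokens = logits.argmax(axis=1).tolist()
answer_ranks = [token_rank(layer_logits, ANSWER) for layer_logits in logits]

print(top_tokens)    # [3, 3, 0]
print(answer_ranks)  # [0, 0, 1]
```

The final row is the point of the example: the emitted token is filler, yet the answer still sits at rank 1, which is what "recoverable from lower-ranked token predictions" means in the source note.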
The thing you didn't know you wanted to know: the procedural backbone that lets a model reason well comes from broad, transferable patterns in pretraining rather than memorized facts (Does procedural knowledge drive reasoning more than factual retrieval?), which is why format can act as such a strong lever. If reasoning is a reusable *procedure* rather than retrieved content, then changing the format is changing which procedure gets invoked. Format isn't decorating the reasoning; it's selecting it.
Sources (8 notes)
Models trained on multiple-choice data adopt breadth-first exploration (Cohen's d up to 1.5), while free-form training produces depth-first reasoning. Format effect dwarfs domain effect, meaning presentation matters far more than content type.
Research shows training format shapes reasoning strategy 7.5× more than domain, demo position swings accuracy 20%, and invalid CoT prompts work as well as valid ones. CoT is pattern-guided generation, not formal logic.
A 1.5B parameter model with LoRA-only post-training matched larger full-parameter RL models on reasoning tasks, suggesting RL teaches output format organization rather than new factual knowledge. This efficiency indicates reasoning and knowledge storage are separable capabilities.
DialogueReason, which structures a single model's internal reasoning as dialogue between distinct agents in separate scenes, overcomes monologue reasoning's fixed-strategy and fragmented-attention weaknesses, especially on tasks requiring multiple problem-solving approaches.
Activation-Steered Compression extracts a single vector from 50 paired examples to reduce chain-of-thought length by 67% while maintaining accuracy and achieving a 2.73× speedup. The method is training-free and generalizes across model sizes and domains.
Logit lens analysis shows models trained with hidden CoT tokens compute correct answers in layers 1-3, then actively suppress these representations in final layers to produce format-compliant filler output. The reasoning is fully recoverable from lower-ranked token predictions.
Multiple architectures—depth-recurrent models, Heima, and Coconut—demonstrate that test-time compute scales through hidden state iteration rather than token generation. This suggests verbalization is a training artifact, not a reasoning requirement.
Analysis of 5 million pretraining documents shows reasoning relies on broad, transferable procedural knowledge from diverse sources, unlike factual recall which depends on narrow, document-specific memorization of target facts.