INQUIRING LINE

Train an AI on math before creative writing and you get a measurably better model — the sequence quietly shapes what it becomes.

Why does training order matter across different domain types?

This explores why the *sequence* in which you train a model on different kinds of tasks (structured vs. creative, rare vs. common, easy vs. hard) changes the outcome — and what mechanically drives that.

The corpus suggests the answer is mechanical, not pedagogical: different domain types push a model's internal state in opposite directions, so whichever you train first leaves a mark the next stage has to fight against. The clearest case is entropy. Work on multi-task reinforcement learning finds that structured domains (math, code) *shrink* a model's output entropy while creative, open-ended domains *grow* it. Training the structured tasks first and then loosening into creative ones prevents an early entropy collapse from quietly killing open-ended ability later, a reported 6.2% gain over joint training on everything at once (see "Does training order reshape how models handle different task types?"). Order matters because the domains have complementary, not additive, effects on the same internal dial.
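To make the "internal dial" concrete, here is a minimal sketch of output entropy for a single next-token distribution. The two distributions and the four-token vocabulary are invented for illustration and are not taken from any of the cited work; the only point is that a peaked distribution has lower Shannon entropy than a flat one.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in nats, of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical next-token distributions over a 4-token vocabulary.
peaked = [0.97, 0.01, 0.01, 0.01]  # stands in for a structured task (math, code)
flat = [0.25, 0.25, 0.25, 0.25]    # stands in for an open-ended creative task

peaked_entropy = shannon_entropy(peaked)
flat_entropy = shannon_entropy(flat)

print(f"peaked: {peaked_entropy:.3f} nats")
print(f"flat:   {flat_entropy:.3f} nats")
```

In a real training run this quantity would be averaged over many sampled positions. Tracking it per stage is one plausible way to watch for the collapse described above.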

The second reason is that 'difficulty' isn't what we think it is. The intuitive curriculum (easy things first, like teaching a child) gets overturned by work showing that ordering by *rarity* (rare data first) beats it, because what limits a fine-tuned model isn't conceptual hardness but distance from its pretraining distribution (see "Does ordering training data by rarity actually improve language models?"). Curriculum, reframed, is about managing where each domain sits relative to what the model already knows, which is exactly why the *order* of domains, not just their content, determines the result. A related ordering signal shows up inside in-context learning too, where representation sparsity (a proxy for which examples are hard for *this* model) can sequence demonstrations without any external difficulty labels (see "Can representation sparsity order few-shot demonstrations effectively?").
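A rare-first ordering can be sketched in a few lines. This sketch assumes rarity is approximated by each example's mean per-token negative log-likelihood under the pretrained model. That proxy, the `Example` class, and the scores are hypothetical illustrations, not the measure the cited work uses.

```python
from dataclasses import dataclass


@dataclass
class Example:
    text: str
    mean_nll: float  # hypothetical rarity proxy: mean per-token NLL under the base model


def rarity_first(examples):
    """Order examples from rarest (highest NLL) to most common (lowest NLL)."""
    return sorted(examples, key=lambda ex: ex.mean_nll, reverse=True)


# Invented scores; in practice they would come from scoring with the base model.
pool = [
    Example("common phrasing", 1.2),
    Example("niche clinical note", 4.7),
    Example("semi-technical memo", 2.9),
]

ordered = rarity_first(pool)
order_texts = [ex.text for ex in ordered]
print(order_texts)
```

The design choice worth noting is that the score is model-relative: the same dataset would be ordered differently for a different pretrained model.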

There's a deeper twist: across domains, *how* data is shaped can dominate *what* domain it's from. One striking finding is that training data format shapes reasoning strategy about 7.5× more strongly than domain content: multiple-choice data produces breadth-first explorers, while free-form data produces depth-first reasoners (see "Does training data format shape reasoning strategy more than domain?"). So 'domain type' is partly a stand-in for the structural conventions a domain carries, and the first format a model commits to can crowd out the rest. That is also why RL post-training tends to collapse onto a single dominant pretraining format within the first epoch, suppressing alternatives (see "Does RL training collapse format diversity in pretrained models?").

Order also matters because earlier training can do hidden, hard-to-reverse damage. Domain adaptation methods each have a 'sweet spot' tied to their domain, and visible gains often come bundled with invisible costs: degraded reasoning faithfulness, lost format flexibility, weaker transfer (see "How do domain training techniques actually reshape model behavior?"). Fine-tuning in particular can make a model's reasoning chains *performative*: the steps stop actually driving the answer (see "Does fine-tuning disconnect reasoning steps from final answers?"). And there's an architectural reason the damage is order-sensitive. Pretraining writes factual knowledge into lower layers while fine-tuning reshapes behavior in upper layers (see "Do pretraining and fine-tuning scale independently in language models?"), so a later stage that corrupts lower-layer storage can erase earlier knowledge. That is why decoding-time approaches that leave base weights untouched preserve knowledge far better (see "Can decoding-time tuning preserve knowledge better than weight fine-tuning?").

The thing worth taking away: 'training order matters' isn't really about scheduling. Different domains move different internal machinery (entropy, format priors, layer-specific storage) in conflicting directions, and order decides which conflict wins. The contrarian corner of the corpus pushes even further, arguing that *structure* can beat both order and volume: organizing knowledge into a taxonomy first lets a model reach half of full-corpus performance on 0.3% of the data, by learning where facts sit in a conceptual map rather than absorbing raw text (see "Can organizing knowledge structures beat raw training data volume?").

Sources (10 notes)

Does training order reshape how models handle different task types?

Omni-Thinker shows structured domains decrease output entropy while creative domains increase it. BWT-guided scheduling—training structured tasks first—yields 6.2% gains over joint training by preventing entropy collapse from damaging open-ended capabilities.

Does ordering training data by rarity actually improve language models?

CTFT fine-tunes LLMs on rare data first because rarity signals distributional weakness, not conceptual difficulty. This reframes curriculum learning as managing distance from pre-training distribution rather than pedagogical scaffolding.

Can representation sparsity order few-shot demonstrations effectively?

Sparsity-Guided Curriculum In-Context Learning uses last-layer activation sparsity to order demonstrations from sparse (harder) to dense (easier), yielding considerable performance improvements. This approach requires no external difficulty labels and works across diverse in-context learning tasks.
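As a rough illustration of the ordering step, the sketch below ranks demonstrations from sparse to dense. It assumes sparsity is measured as the fraction of near-zero last-layer activations. The threshold `eps`, the function names, and the activation values are all hypothetical, and the method's actual sparsity definition may differ.

```python
def sparsity(activations, eps=1e-3):
    """Fraction of activations with magnitude below eps (eps is an assumed threshold)."""
    return sum(1 for a in activations if abs(a) < eps) / len(activations)


def order_sparse_to_dense(demos):
    """demos: list of (label, last_layer_activations) pairs; sparsest comes first."""
    ranked = sorted(demos, key=lambda d: sparsity(d[1]), reverse=True)
    return [label for label, _ in ranked]


# Invented activation vectors standing in for real last-layer states.
demos = [
    ("demo_a", [0.5, 0.2, 0.9, 0.1]),  # dense
    ("demo_b", [0.0, 0.0, 0.7, 0.0]),  # sparse
    ("demo_c", [0.0, 0.3, 0.0, 0.4]),  # in between
]

ordering = order_sparse_to_dense(demos)
print(ordering)
```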

Does training data format shape reasoning strategy more than domain?

Models trained on multiple-choice data adopt breadth-first exploration (Cohen's d up to 1.5), while free-form training produces depth-first reasoning. Format effect dwarfs domain effect, meaning presentation matters far more than content type.
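For readers unfamiliar with the effect-size measure quoted here, Cohen's d is the difference between two group means divided by their pooled standard deviation. The sketch below computes it with the standard pooled-variance formula. The two samples are invented numbers, not data from the study.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d with pooled sample variance (standard two-sample form)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Hypothetical per-model scores on some breadth-of-exploration metric.
multiple_choice_trained = [4, 5, 6, 5, 5]
free_form_trained = [3, 4, 4, 3, 4]

d = cohens_d(multiple_choice_trained, free_form_trained)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, values around 0.8 already count as a large effect, which puts the reported ceiling of 1.5 in context.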

Does RL training collapse format diversity in pretrained models?

Controlled experiments show RL consistently amplifies one format distribution from pretraining within the first epoch while collapsing alternatives. The winning format depends on model scale, not necessarily performance, and is largely hidden when starting from proprietary pretrained models.

How do domain training techniques actually reshape model behavior?

Research shows every adaptation method—from parameter-efficient tuning to knowledge graph curricula—has optimal conditions tied to specific domains. The key finding: visible benefits like performance gains often come with hidden degradation in reasoning faithfulness, capability transfer, and format flexibility.

Does fine-tuning disconnect reasoning steps from final answers?

Three faithfulness tests show fine-tuned models generate reasoning chains that less reliably influence final outputs. Early termination, paraphrasing, and filler substitution all produce invariant answers more often after fine-tuning, suggesting reasoning becomes performative rather than functional.
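The early-termination test can be sketched as follows: cut the reasoning chain at each step and count how often the answer stays the same as with the full chain. The `answer_fn` interface and both toy "models" are hypothetical stand-ins for a real model call; a high invariance rate is the signature of performative reasoning.

```python
def invariance_rate(answer_fn, question, steps):
    """Fraction of truncation points whose answer matches the full-chain answer."""
    full_answer = answer_fn(question, steps)
    unchanged = sum(
        1 for keep in range(len(steps))
        if answer_fn(question, steps[:keep]) == full_answer
    )
    return unchanged / len(steps)


def chain_driven(question, steps):
    """Toy model whose answer is read off the last reasoning step it was given."""
    return steps[-1] if steps else "unknown"


def chain_ignoring(question, steps):
    """Toy model that answers the same way regardless of its reasoning chain."""
    return "42"


steps = ["x = 6", "y = 7", "42"]
faithful_rate = invariance_rate(chain_driven, "What is 6 * 7?", steps)
performative_rate = invariance_rate(chain_ignoring, "What is 6 * 7?", steps)
print(faithful_rate, performative_rate)
```

Paraphrasing and filler substitution would follow the same pattern, swapping the truncation for a different perturbation of the chain.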

Do pretraining and fine-tuning scale independently in language models?

Emulated Fine-Tuning reveals that scaling pretraining improves factual knowledge while scaling fine-tuning improves behavioral helpfulness. This decoupling has architectural roots: pretraining enriches lower-layer knowledge storage, while fine-tuning modifies upper-layer behavior expression.

Can decoding-time tuning preserve knowledge better than weight fine-tuning?

Proxy-tuning closes 88-91% of the alignment gap while surpassing direct fine-tuning on knowledge tasks by leaving base model weights untouched. Direct fine-tuning corrupts knowledge storage in lower layers, whereas proxy-tuning applies distributional shifts that primarily affect reasoning and style.
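The decoding-time shift can be sketched as logit arithmetic: the large base model's next-token logits are offset by the difference between a small tuned model (the expert) and its untuned counterpart (the anti-expert), and no base weights change. The three-token vocabulary and all logit values below are invented for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_probs(base_logits, expert_logits, antiexpert_logits):
    """Shift base logits by (expert - anti-expert), then normalize."""
    shifted = [
        b + (e - a)
        for b, e, a in zip(base_logits, expert_logits, antiexpert_logits)
    ]
    return softmax(shifted)


# Hypothetical logits over a 3-token vocabulary.
base = [2.0, 1.0, 0.5]        # large untuned model
expert = [0.0, 2.0, 0.0]      # small tuned model
antiexpert = [0.0, 0.0, 0.0]  # small untuned model

tuned = proxy_tuned_probs(base, expert, antiexpert)
print(tuned)
```

When the expert and anti-expert agree, the offset is zero and the base distribution passes through unchanged. That is the sense in which the base model's knowledge is left alone.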

Can organizing knowledge structures beat raw training data volume?

StructTuning achieves 50% of full-corpus performance using only 0.3% of training data by organizing chunks into auto-generated domain taxonomies. The model learns knowledge position within conceptual structures rather than raw text patterns, matching how students learn from textbooks.

Papers this line draws on (8)

The research behind the notes this line reads — ranked by how closely each paper relates.

Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining (4.21 match)
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts (2.47 match)
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? (2.44 match)
An Emulator for Fine-Tuning Large Language Models using Small Language Models (1.67 match)
Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces! (1.67 match)
Tuning Language Models by Proxy (1.67 match)
Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey (1.64 match)
Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs (1.63 match)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about training order effects across domains in LLMs. The question: Why does training order matter when a model learns across different domain types — and has this understanding shifted since mid-2023?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026; treat as perishable.
• Structured domains (math, code) shrink output entropy; creative domains expand it. Training structured-first prevents entropy collapse that kills open-ended ability later (~2025, multi-task RL).
• Ordering by *rarity* (rare data first) beats intuitive easy-to-hard curriculum, because distance from pretraining distribution, not conceptual hardness, limits fine-tuning (~2024–2025).
• Training data *format* shapes reasoning strategy ~7.5× more strongly than domain content; first format crowds out alternatives (~2025).
• RL post-training collapses onto a single dominant pretraining format within first epoch, suppressing others (~2025).
• Fine-tuning degrades chain-of-thought faithfulness: reasoning steps become performative, not causal (~2024).
• Structuring knowledge into taxonomy first achieves ~50% of full-corpus performance on 0.3% of data (~2024).

Anchor papers (verify; mind their dates):
• arXiv:2407.16724 (2024-07): Structure-aware knowledge injection
• arXiv:2411.15382 (2024-11): Fine-tuning impact on CoT reasoning
• arXiv:2504.07912 (2025-04): RL post-training format amplification
• arXiv:2603.03415 (2026-03): OOD representation sparsity mechanisms

Your task:
(1) RE-TEST EACH CONSTRAINT. For entropy dynamics, format dominance, and CoT degradation: has newer scaling, multi-stage training (e.g., continued pretraining before RL), or better-instrumented RL changed these? Where do the mechanical constraints still hold? Separate durable mechanism (entropy/format interaction) from perishable limitation (e.g., "first format always wins").
(2) Surface the strongest work from the last 6 months that *contradicts* the format-dominance or format-collapse claims, or shows order-invariance under certain conditions.
(3) Propose 2 research questions that assume the regime may have moved: (a) Does curriculum-order sensitivity decrease with model scale or architectural innovation? (b) Can explicit format-mixing during pretraining inoculate against post-training format collapse?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
