Why do language models tend to elaborate and expand rather than compress information?
This explores a real paradox in the corpus: LLMs compress aggressively *inside* their representations, yet their *outputs* tend to sprawl and elaborate — so the question is really about where the expansion comes from.
The most direct answer in the corpus is that two different things are going on, and they pull in opposite directions. Internally, LLMs are ruthless compressors. Measured with rate-distortion theory against human cognitive data, models capture broad category structure but throw away the fine-grained, context-dependent distinctions humans preserve: they maximize statistical efficiency where humans trade efficiency for situated meaning (see: Do LLMs compress concepts more aggressively than humans do?). So the elaboration you see in the text is not the model failing to compress its *knowledge*. The model compresses the concepts and then spends many tokens emitting them.
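The rate-distortion framing above can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: items are equally likely and each is assigned to exactly one cluster (so the rate reduces to the entropy of the cluster sizes), and distortion is the mean squared distance to the cluster centroid. The function name `rate_distortion` and the four example points are hypothetical, not the measurement or data used in the cited note.

```python
import math
from collections import defaultdict

def rate_distortion(points, labels):
    """Return (rate_in_bits, distortion) for a hard clustering of equally likely items."""
    n = len(points)
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    # Rate: with deterministic assignments and uniform items, I(X; C) = H(C).
    rate = -sum((len(m) / n) * math.log2(len(m) / n) for m in clusters.values())

    # Distortion: mean squared distance from each item to its cluster centroid.
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members)
    return rate, total / n

# Made-up toy data: two tight pairs that are far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

coarse = rate_distortion(points, [0, 0, 1, 1])  # (1.0, 0.25): cheap, loses within-pair detail
fine = rate_distortion(points, [0, 1, 2, 3])    # (2.0, 0.0): costs more bits, keeps every distinction
```

In this framing, "compressing harder" means sitting closer to the coarse end of the trade-off: fewer bits spent on category membership, at the cost of the fine-grained distinctions inside each category.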
Why spend the tokens? Part of the answer is that the surface text and the actual computation can come apart. When models are trained to hide their reasoning, logit-lens analysis shows the correct answer being computed in the earliest layers (layers 1-3) and then actively *suppressed* in the final layers to produce format-compliant filler. The real reasoning is still recoverable from lower-ranked token predictions, but what gets printed is padding (see: Do transformers hide reasoning before producing filler tokens?). Expansion, in other words, can be a performance: the model has already decided, and the extra prose is downstream of formatting pressure rather than thinking.
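The logit-lens idea behind that finding can be sketched in a few lines: take the hidden state at each layer, project it through the unembedding matrix, and see which tokens rank highest. This is a minimal toy version in pure Python, assuming a three-token vocabulary, an identity unembedding, and hand-written hidden states. All values and the function name `logit_lens` are hypothetical, and real implementations typically also apply the model's final layer norm before projecting.

```python
def logit_lens(hidden_states, unembed, vocab):
    """For each layer, project the hidden state to vocabulary logits and rank the tokens."""
    rankings = []
    for h in hidden_states:
        # logits[j] = sum_i h[i] * unembed[i][j]  (hidden state times unembedding matrix)
        logits = [sum(h[i] * unembed[i][j] for i in range(len(h))) for j in range(len(vocab))]
        order = sorted(range(len(vocab)), key=lambda j: logits[j], reverse=True)
        rankings.append([vocab[j] for j in order])
    return rankings

vocab = ["42", "...", "the"]          # hypothetical tokens: answer, filler, unrelated
unembed = [[1.0, 0.0, 0.0],           # identity unembedding keeps the toy readable
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
hidden_states = [
    [2.0, 0.5, 0.1],                  # early layer: the answer token dominates
    [2.5, 1.0, 0.0],                  # middle layer: still the answer
    [1.0, 3.0, 0.0],                  # final layer: filler wins, answer drops to rank 2
]

rankings = logit_lens(hidden_states, unembed, vocab)
```

In this toy, the answer token is ranked first at the early layers and second at the final layer, which mirrors the pattern described above: the printed token is filler, while the answer remains readable from the lower-ranked predictions.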
The corpus traces that formatting pressure to a training incentive. RLHF rewards models for producing confident, complete-looking answers rather than for asking a clarifying question or stopping short, which is exactly the behavior that makes them lock in early and then justify at length (see: Why do language models lose performance in longer conversations?). A related dynamic shows up in exploration: uncertainty signals dominate the early transformer blocks while longer-horizon 'empowerment' signals emerge only in the middle blocks, so models commit before the signals that would make them hold back ever arrive (see: Why do large language models explore less effectively than humans?). Premature commitment plus a reward for looking complete is a recipe for confident elaboration.
The genuinely surprising thread is that compression and selectivity *do* happen, just not in the visible output. On hard, out-of-distribution tasks, hidden states sparsify in a localized, systematic way that acts as a stabilizing filter, not a breakdown (see: Do language models sparsify their activations under difficult tasks?). So the model is quietly narrowing internally at the same moment its prose may be widening. The thing you didn't know you wanted to know: 'compress vs. elaborate' isn't one knob. These systems compress meaning and inflate text simultaneously, and most of what reads as verbosity is a training-shaped output habit layered on top of a compression engine.
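One way to make "hidden states sparsify" measurable is to score an activation vector for how concentrated its mass is. The sketch below shows two common choices, the fraction of near-zero entries and Hoyer's sparsity measure. Both are assumptions chosen for illustration: the cited note does not say which metric it uses, and the example vectors are made up rather than taken from any model.

```python
import math

def near_zero_fraction(activations, eps=1e-6):
    """Fraction of entries whose magnitude is at most eps."""
    return sum(1 for a in activations if abs(a) <= eps) / len(activations)

def hoyer_sparsity(activations):
    """Hoyer sparsity: 0.0 for a perfectly uniform vector, 1.0 for a one-hot vector."""
    n = len(activations)
    l1 = sum(abs(a) for a in activations)
    l2 = math.sqrt(sum(a * a for a in activations))
    if n < 2 or l2 == 0.0:
        raise ValueError("need at least two entries and a non-zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

# Made-up toy activations, not measurements from any model.
dense = [1.0, 1.0, 1.0, 1.0]    # mass spread evenly across units
sparse = [1.0, 0.0, 0.0, 0.0]   # mass concentrated in a single unit

dense_score = hoyer_sparsity(dense)     # 0.0
sparse_score = hoyer_sparsity(sparse)   # 1.0
```

Comparing such scores layer by layer, for easy versus hard prompts, would be one way to check whether internal narrowing and output length move independently, as the paragraph above suggests.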
Sources (5 notes)
Using Rate-Distortion Theory on cognitive datasets, LLMs capture broad category structure but lose fine-grained distinctions humans preserve. LLMs maximize compression efficiency; humans trade compression for contextual meaning that enables situated action.
Logit lens analysis shows models trained with hidden CoT tokens compute correct answers in layers 1-3, then actively suppress these representations in final layers to produce format-compliant filler output. The reasoning is fully recoverable from lower-ranked token predictions.
LLMs degrade in multi-turn settings because RLHF training rewards premature answers over clarification-seeking, creating pragmatic mismatch with individual user behaviors. A Mediator-Assistant architecture that explicitly parses user intent before execution recovers lost performance without retraining.
SAE decomposition shows uncertainty values dominate early transformer blocks while empowerment representations emerge only in middle blocks. This temporal mismatch causes models to commit to decisions before long-term exploration signals can influence them. Reasoning-trained o1 overcomes this by extending computation time.
As task difficulty increases, LLM hidden states become substantially sparser in a localized, systematic way that correlates with task unfamiliarity and reasoning load. This sparsification acts as a selective filter stabilizing performance under OOD shift rather than a failure mode.