Can search escape the entropy shell of language models?
Autoregressive search is confined to a narrow region around a model's learned probability mass. What techniques could break through this boundary and reach solutions the model alone rarely produces?
The framing in Bidirectional Evolutionary Search (BES) is sharper than its method. It diagnoses two coupled failures shared by best-of-N and tree search: candidates are built almost entirely by autoregressive expansion, which confines them to a "narrow entropy shell" around the model's probability mass; and they are steered by sparse verification, a signal that only arrives at the end. BES attacks each axis separately — forward search adds evolution operators that recombine partial trajectories into candidates a single rollout would rarely produce, and backward search recursively decomposes the task into checkable sub-goals that supply dense intermediate feedback. The theoretical claim is that recombination escapes the entropy shell, and that backward decomposition can exponentially cut the samples needed to hit a correct answer.
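The recombination claim can be illustrated with a toy sketch. This is not the BES operator set: trajectories are reduced to lists of step choices, `verify` is a hypothetical sparse terminal check, and one-point crossover stands in for whatever evolution operators the paper actually uses. The point is only that a population in which no single rollout passes can still contain the pieces of a passing candidate.

```python
from itertools import permutations

def verify(trajectory, target):
    """Sparse terminal check (hypothetical): passes only if every step matches."""
    return trajectory == target

def one_point_children(a, b):
    """Splice a prefix of `a` onto a suffix of `b` at every interior cut point."""
    return [a[:cut] + b[cut:] for cut in range(1, len(a))]

def recombine(population):
    """Apply one-point crossover to every ordered pair of distinct parents."""
    children = []
    for a, b in permutations(population, 2):
        children.extend(one_point_children(a, b))
    return children

# Toy setup (assumed for illustration): 1 marks a correct step, 0 a wrong one.
target = [1, 1, 1, 1, 1, 1]
population = [
    [1, 1, 1, 0, 0, 0],  # correct first half only
    [0, 0, 0, 1, 1, 1],  # correct second half only
    [1, 0, 1, 0, 1, 0],  # scattered correct steps
]

parents_solved = any(verify(t, target) for t in population)
children = recombine(population)
children_solved = any(verify(c, target) for c in children)

print("any parent passes:", parents_solved)
print("any recombinant passes:", children_solved)
```

The toy makes fragments compose trivially, because each step is independent of the others. Real reasoning traces carry dependencies between steps, so a spliced candidate may be incoherent even when both halves looked fine in their original context.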
BES consolidates a cluster the vault has been circling. "Can evolutionary search beat sampling and revision at inference time?" is the direct forward-search precedent; "How should we balance parallel versus sequential compute at test time?" frames the dichotomy that BES claims to transcend with population-based recombination. The backward half is the more novel synthesis: it operationalizes "Does planning direction affect how hard problems become?" as a feedback-density mechanism rather than only a search-space pruner. Sub-goals are valued because they are checkable, which turns a sparse terminal signal into a dense one.
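The gap between a sparse terminal signal and a dense one can be made concrete with back-of-envelope arithmetic. The model below is an assumption for illustration, not the paper's theorem: a task has `k` steps, each sampled step is correct independently with probability `p`, and a checked sub-goal stays solved once it passes. The function names are hypothetical.

```python
def expected_step_samples_sparse(k, p):
    """Only the final answer is checked.

    A full trajectory passes with probability p**k, so the expected number of
    trajectories is 1 / p**k, and each trajectory costs k step samples.
    """
    return k / (p ** k)

def expected_step_samples_dense(k, p):
    """Each sub-goal is checked on its own and retried until it passes.

    Each sub-goal needs 1 / p tries in expectation, and there are k of them.
    """
    return k / p

if __name__ == "__main__":
    p = 0.5
    for k in (2, 5, 10):
        sparse = expected_step_samples_sparse(k, p)
        dense = expected_step_samples_dense(k, p)
        print(f"k={k:2d}  sparse={sparse:8.1f}  dense={dense:5.1f}  ratio={sparse / dense:6.1f}")
```

Under these assumptions the ratio is `(1/p)**(k-1)`, which grows exponentially in the number of sub-goals. The saving depends entirely on the sub-goal checks being reliable and on solved sub-goals not interfering with one another.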
The honest caveat is that recombining partial trajectories assumes those fragments compose into coherent wholes, which is exactly where natural-language reasoning is brittle; "escapes the entropy shell" can also mean "drifts into incoherent regions the verifier was never calibrated for." And backward decomposition only helps when sub-goals are genuinely verifiable: on open-ended tasks the decomposition step inherits the same generation problem it was meant to bypass. BES connects naturally to "Can AI systems improve themselves through trial and error?": both replace a clean verification signal with empirical, recombinative search, and both live or die on the quality of the cheap check.
Related concepts in this collection (4)
- Can evolutionary search beat sampling and revision at inference time?
  Can LLMs evolve populations of solutions through recombination and selection to outperform simpler inference strategies? This matters because it could reveal whether biologically inspired search improves planning without formal problem definitions.
  extends (BES generalizes forward evolutionary search and pairs it with backward decomposition)
- How should we balance parallel versus sequential compute at test time?
  Test-time compute can prioritize breadth (trying many approaches) or depth (refining one approach). Which strategy works better, and does the answer depend on the problem?
  extends (population recombination as a third mode beyond the parallel/sequential split)
- Does planning direction affect how hard problems become?
  Planning research typically goes forward only. But some problems get easier when you work backward from the goal. What makes direction matter, and can language models exploit this?
  grounds (reframes backward planning as a dense-feedback source)
- Can AI systems improve themselves through trial and error?
  Explores whether replacing formal proof requirements with empirical benchmark testing enables AI systems to modify and improve their own code iteratively, and what mechanisms prevent compounding failures.
  convergent-with (empirical recombinative search in place of a clean verifier)
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Self-Improving Language Models with Bidirectional Evolutionary Search
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Bilevel Autoresearch: Meta-Autoresearching Itself
- Vector Policy Optimization: Training for Diversity Improves Test-Time Search
- Large Language Diffusion Models
- Language Modeling by Language Models
- Stream of Search (SoS): Learning to Search in Language
- Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Original note title
autoregressive search cannot leave the entropy shell of the model that generated it — escaping requires recombination forward and decomposition backward