SYNTHESIS NOTE

Why do accurate predictions lead to poor decisions?

Predictive models are built to fit data, not to optimize decision outcomes. This note explores when and why accurate forecasts fail to produce good choices.

Synthesis note · 2026-02-22 · sourced from LLM Architecture

"All AI Models Are Wrong, but Some are Optimal" (2501.06086) formalizes a gap that practitioners experience intuitively: accurate prediction does not guarantee good decisions. The paper establishes necessary and sufficient conditions for a predictive model (AI-based or not) to support optimal sequential decision-making.

The core problem: predictive models are typically constructed to approximate the real system's future behavior as closely as possible. But real systems are stochastic, and even with abundant data, the model is always an approximation. The construction of the predictive model is generally agnostic to the decision-making objectives — it has no direct relationship to the performance measure of the resulting decisions.

This matters because sequential decision-making requires accounting for future uncertainty, the availability of new information for future decisions, and both short- and long-term consequences. A model that predicts accurately on average may systematically mispredict in the states that matter most for decision quality. The related note "Can utility-weighted training loss actually harm model performance?" makes the mechanism precise: the loss function shapes gradients for both representation learning and decision-making simultaneously, and optimizing one can weaken the other.
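A toy illustration of the point that average accuracy and decision quality can come apart. This is a minimal sketch, not the paper's formal setup: the payoff values, both "models", and the one-step rule "act when the predicted payoff is positive" are all invented for illustration.

```python
# Toy example: every number here is made up for illustration.
# truth[i] is the real payoff of taking an action in state i;
# declining the action always pays 0.
truth = [-5.0, -4.0, -3.0, -0.5, -0.2, 0.2, 0.5, 3.0, 4.0, 5.0]

# Hypothetical model A: small constant bias, so it fits the data closely
# but gets the sign wrong in the states near the decision boundary.
model_a = [t + 0.6 for t in truth]

# Hypothetical model B: shrinks every payoff by half, so it fits the data
# poorly but never gets the sign wrong.
model_b = [0.5 * t for t in truth]


def mse(pred, actual):
    """Mean squared prediction error."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def realized_value(pred, actual):
    """Total real payoff when we act only where the prediction is positive."""
    return sum(a for p, a in zip(pred, actual) if p > 0)


def regret(pred, actual):
    """Gap between the best achievable payoff and the payoff actually obtained."""
    best = sum(a for a in actual if a > 0)
    return best - realized_value(pred, actual)


if __name__ == "__main__":
    for name, pred in [("model_a", model_a), ("model_b", model_b)]:
        print(name, "mse:", round(mse(pred, truth), 3),
              "regret:", round(regret(pred, truth), 3))
```

In this sketch model A has the lower squared error (about 0.36 versus about 2.51) yet loses payoff, because its errors land exactly where the decision flips. Model B fits worse everywhere and still makes every decision correctly.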

The connection to reward models is direct. As the related note "Do reward models actually consider what the prompt asks?" discusses, reward models exhibit exactly this prediction-decision gap: they predict quality accurately on average but fail to condition on the decision-relevant information (the prompt). The formal framework here provides theoretical grounding for why prompt-insensitive reward models produce suboptimal alignment.
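The same gap can be sketched for a prompt-insensitive scorer. This is a hypothetical toy, not a real reward model: the quality table, the prompt and response names, and both scoring functions are assumptions made up for illustration. The prompt-blind score is each response's mean quality across prompts, which is the best guess available to a scorer that never looks at the prompt.

```python
# Toy quality table: quality[prompt][response]. All values are invented.
quality = {
    "prompt_1": {"r1": 0.9, "r2": 0.2, "r3": 0.6},
    "prompt_2": {"r1": 0.1, "r2": 0.8, "r3": 0.6},
}
responses = ["r1", "r2", "r3"]


def prompt_blind_score(response):
    """Score a response by its average quality over all prompts (ignores the prompt)."""
    return sum(quality[p][response] for p in quality) / len(quality)


def best_blind(prompt):
    """Pick a response using the prompt-blind score; the prompt argument is unused."""
    return max(responses, key=prompt_blind_score)


def best_aware(prompt):
    """Pick a response using the quality conditioned on the actual prompt."""
    return max(responses, key=lambda r: quality[prompt][r])


if __name__ == "__main__":
    for prompt in quality:
        blind, aware = best_blind(prompt), best_aware(prompt)
        print(prompt,
              "blind pick:", blind, quality[prompt][blind],
              "aware pick:", aware, quality[prompt][aware])
```

Here the prompt-blind scorer selects r3 for both prompts, a response that is acceptable everywhere and best nowhere, while the prompt-aware scorer selects r1 for the first prompt and r2 for the second.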

As the related note "Why do language models fail to act on their own reasoning?" shows, the prediction-decision gap also appears within a single model: its rationale can identify the right action, yet its greedy action fails to execute it. Good prediction, suboptimal decision.

Original note title

predictive AI models optimized for data fit produce suboptimal decisions — formal conditions define when prediction enables optimal policy