What five design choices compose a world model?
World models are often presented as monolithic systems, but they actually involve five distinct design decisions—data preparation, representation, reasoning architecture, training objective, and decision integration—that can each fail independently. Understanding this decomposition helps diagnose why world model proposals fall short.
World model proposals often present themselves as monolithic: a video generator, a latent dynamics model, a foundation model. The Critiques of World Models essay argues that this hides a structural fact: a world model (WM) is a composition of five distinct design choices, and any of them can be misaligned with the others. Treating the WM as a single thing makes it impossible to diagnose why it fails, because the failure could lie at any of the five layers. The decomposition also resolves the ambiguity flagged in the related note "Do LLMs actually have world models or just facts?"
The five aspects:
(1) Identifying and preparing training data with the desired world information: what observations does the model see, and do they actually contain the structure needed for the intended downstream tasks?
(2) Adopting a general representation space for the latent world state, possibly with richer meaning than is visible in the raw observation data: does the latent representation expose the right invariances for reasoning, or does it merely reconstruct the input?
(3) Designing an architecture that allows effective reasoning over the representations: does the model support compositional, counterfactual, and hierarchical operations, or only single-step prediction?
(4) Choosing an objective that properly guides model training: does the loss target the simulation-of-possibilities goal, or does it reward only observation reconstruction?
(5) Determining how to use the world model in a decision-making system: how do the outputs of the WM feed into action selection, planning, or policy?
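One way to picture the decomposition is as five separately replaceable slots in a single structure. The sketch below is only an illustration under stated assumptions: the names `WorldModelDesign`, `greedy_decide`, and `squared_error`, and the one-dimensional toy world, are hypothetical and do not come from the Critiques of World Models essay.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Observation = Tuple[float, ...]
Latent = Tuple[float, ...]
Action = float


@dataclass(frozen=True)
class WorldModelDesign:
    """Hypothetical container with one slot per design aspect."""
    prepare_data: Callable[[List[Observation]], List[Observation]]  # aspect 1: data
    encode: Callable[[Observation], Latent]                         # aspect 2: representation
    predict: Callable[[Latent, Action], Latent]                     # aspect 3: reasoning architecture
    objective: Callable[[Latent, Latent], float]                    # aspect 4: training objective
    decide: Callable[["WorldModelDesign", Observation, Sequence[Action], Latent], Action]  # aspect 5


def squared_error(a: Latent, b: Latent) -> float:
    """Sum of squared differences between two latent states."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_decide(design: WorldModelDesign, obs: Observation,
                  actions: Sequence[Action], goal: Latent) -> Action:
    """Pick the action whose simulated next latent lies closest to the goal."""
    state = design.encode(obs)
    return min(actions, key=lambda a: squared_error(design.predict(state, a), goal))


# Toy instance (an assumption for illustration): the world is a single position on a line.
toy = WorldModelDesign(
    prepare_data=lambda obs: [o for o in obs if len(o) == 1],  # drop malformed observations
    encode=lambda o: o,                                        # identity representation
    predict=lambda z, a: (z[0] + a,),                          # position moves by the action
    objective=squared_error,                                   # would score predictions in training
    decide=greedy_decide,
)

clean = toy.prepare_data([(0.0,), (), (2.0,)])
action = toy.decide(toy, (0.0,), [-1.0, 0.0, 1.0], (1.0,))
print(clean, action)
```

Because each slot is an independent field, any one of them can be swapped or found wanting without touching the others, which is the sense in which each aspect can fail on its own.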
A WM that nails some of these aspects and fails on the others exhibits a coherent, diagnosable kind of failure. A video generator with stunning reconstruction quality (1, 2, 4) but no architecture for counterfactual queries (3) and no integration with decision-making (5) is not a world model in the functional sense, however impressive its outputs. Conversely, a model with rich representations but poor data coverage cannot simulate what its data did not expose.
The design pattern this exposes: when evaluating a proposal claiming to be a world model, decompose the claim into the five aspects and check each. Most of the disagreement in the WM literature is about which aspects matter and how they should be ordered, not about whether to build a WM at all. The five-aspect frame makes those disagreements explicit rather than letting them remain folded into vague terminology.
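The per-aspect check can be written down as a small checklist. This is a minimal sketch, assuming each aspect is judged simply as satisfied or not; the `Aspect` enum and the `failing_aspects` helper are hypothetical names, and the example assessment encodes the note's own video-generator case rather than any measured result.

```python
from enum import Enum
from typing import Dict, List


class Aspect(Enum):
    DATA = 1
    REPRESENTATION = 2
    ARCHITECTURE = 3
    OBJECTIVE = 4
    DECISION_INTEGRATION = 5


def failing_aspects(assessment: Dict[Aspect, bool]) -> List[Aspect]:
    """Return the aspects judged unsatisfied; refuse to skip any aspect."""
    missing = [a for a in Aspect if a not in assessment]
    if missing:
        raise ValueError(f"unassessed aspects: {[a.name for a in missing]}")
    return [a for a in Aspect if not assessment[a]]


# The video generator described in this note: strong on 1, 2, 4; weak on 3 and 5.
video_generator = {
    Aspect.DATA: True,
    Aspect.REPRESENTATION: True,
    Aspect.ARCHITECTURE: False,
    Aspect.OBJECTIVE: True,
    Aspect.DECISION_INTEGRATION: False,
}

print([a.name for a in failing_aspects(video_generator)])
# prints ['ARCHITECTURE', 'DECISION_INTEGRATION']
```

Raising an error on an unassessed aspect is a deliberate choice in this sketch: it forces a proposal to be judged on all five aspects instead of only the ones it advertises.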
Inquiring lines that use this note as a source (5)
This note is a source for these synthesized inquiries.
- Why do foundation models develop heuristics instead of world models?
- Can a world model have rich representations without adequate data coverage?
- Why does integrating world models with decision-making systems matter?
- Why must world models be nested rather than flat and uniform?
- What's the difference between representing world facts and generating world mechanisms?
Related concepts in this collection (5)
- What should a world model actually be designed to do?
Current AI research treats world models as either video predictors or RL dynamics learners, but what if their real purpose is simulating actionable possibilities for decision-making rather than predicting next observations?
extends: companion piece — the goal definition picks aspect 5 (decision integration) and aspect 4 (objective) as primary
- Do LLMs actually have world models or just facts?
The term 'world model' conflates two different capabilities: factual representation versus mechanistic understanding. Understanding which one LLMs actually possess matters for assessing their reasoning reliability.
complements: the five-aspect frame disambiguates the WM term that this earlier insight argued was conflated
- Can language models simulate belief change in people?
Current LLM social simulators treat behavior as input-output mappings without modeling internal belief formation or revision. Can they be redesigned to actually track how people think and change their minds?
exemplifies: behaviorist social-sim agents fail aspect 2 (representation) and aspect 3 (architecture for counterfactuals) — concrete instance of the misalignment the framework predicts
- Can we measure reasoning quality beyond output plausibility?
How might we evaluate whether AI systems reason internally like humans do, rather than just producing human-like outputs? This matters because surface coherence can mask broken underlying reasoning.
exemplifies: RECAP operationalizes aspect 4 (objective) as something measurable rather than vague
- Can identical outputs hide broken internal representations?
Can neural networks produce correct outputs while having fundamentally fractured internal structure that prevents generalization and creativity? This challenges our assumptions about what performance benchmarks actually measure.
extends: aspect 2 (representation) failure mode — strong outputs from broken latents are exactly what the five-aspect decomposition exposes
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Critiques of World Models
- What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
- Determinants of LLM-assisted Decision-Making
- Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games
- Divide-or-Conquer? Which Part Should You Distill Your LLM?
- Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning
- Systematic synthesis of design prompts for large language models in conceptual design
- Reasoning Language Models: A Blueprint
Original note title
building a world model decomposes into five inseparable design choices — data representation architecture objective and decision-system integration