SYNTHESIS NOTE

What five design choices compose a world model?

World models are often presented as monolithic systems, but they actually involve five distinct design decisions—data preparation, representation, reasoning architecture, training objective, and decision integration—that can each fail independently. Understanding this decomposition helps diagnose why world model proposals fall short.

Synthesis note · 2026-05-03 · sourced from World Models

World model proposals often present themselves as monolithic: a video generator, a latent dynamics model, a foundation model. The Critiques of World Models essay argues that this hides a structural fact: a world model (WM) is a composition of five distinct design choices, and any of them can be misaligned with the others. Treating the WM as a single thing makes it impossible to diagnose why it fails, because the failure could lie at any of the five layers. The decomposition also resolves the ambiguity flagged in the related note "Do LLMs actually have world models or just facts?"

The five aspects:

(1) Identifying and preparing training data with the desired world information: what observations does the model see, and do they actually contain the structure needed for the intended downstream tasks?

(2) Adopting a general representation space for the latent world state with possibly richer meaning than the observation data in plain sight: does the latent representation expose the right invariances for reasoning, or does it merely reconstruct the input?

(3) Designing an architecture that allows effective reasoning over the representations: does the model support compositional, counterfactual, hierarchical operations, or only single-step prediction?

(4) Choosing an objective that properly guides the model training: does the loss target the simulation-of-possibilities goal, or does it reward only observation reconstruction?

(5) Determining how to use the world model in a decision-making system: how do the outputs of the WM feed into action selection, planning, or policy?
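One way to picture the five choices as separable parts is the toy sketch below. It is an illustration only, not a construction from the essay: the class `WorldModelDesign`, its field names, and the one-dimensional "position" world are all hypothetical, chosen so that each design choice occupies its own slot and can be swapped or inspected on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Toy stand-ins (assumptions): a real system would use frames, tokens, tensors.
Observation = float
Latent = float
Action = float
Dynamics = Callable[[Latent, Action], Latent]


@dataclass
class WorldModelDesign:
    """Hypothetical container: one slot per design choice."""
    prepare_data: Callable[[Sequence[Optional[Observation]]], List[Observation]]  # (1) data
    encode: Callable[[Observation], Latent]                                       # (2) representation
    predict_next: Dynamics                                                        # (3) reasoning architecture
    objective: Callable[[Latent, Latent], float]                                  # (4) training objective
    choose_action: Callable[[Latent, Sequence[Action], Dynamics], Action]         # (5) decision integration


def greedy_planner(goal: Latent):
    """Pick the action whose predicted next state lands closest to the goal."""
    def choose(z: Latent, actions: Sequence[Action], predict: Dynamics) -> Action:
        return min(actions, key=lambda a: abs(predict(z, a) - goal))
    return choose


toy = WorldModelDesign(
    prepare_data=lambda obs: [o for o in obs if o is not None],  # drop missing readings
    encode=lambda o: float(o),                                   # identity "latent"
    predict_next=lambda z, a: z + a,                             # position plus displacement
    objective=lambda pred, target: (pred - target) ** 2,         # squared error in latent space
    choose_action=greedy_planner(goal=3.0),
)

data = toy.prepare_data([0.0, None, 1.0, 2.0])
z = toy.encode(data[-1])
action = toy.choose_action(z, [-1.0, 0.0, 1.0], toy.predict_next)
loss = toy.objective(toy.predict_next(z, action), 3.0)
```

Because each slot is independent, a failure can be localized: replacing `choose_action` with a function that ignores `predict` leaves choices (1) to (4) intact but severs the model from decision-making.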

A WM that gets some of these right and fails on the others exhibits a coherent, diagnosable kind of failure. A video generator with stunning reconstruction quality (1, 2, 4) but no architecture for counterfactual queries (3) and no integration with decision-making (5) is not a world model in the functional sense, however impressive its outputs. Conversely, a model with rich representations but poor data coverage cannot simulate what its data did not expose.

The design pattern this exposes: when evaluating a proposal claiming to be a world model, decompose the claim into the five aspects and check each. Most of the disagreement in the WM literature is about which aspects matter and how they should be ordered, not about whether to build a WM at all. The five-aspect frame makes those disagreements explicit rather than letting them remain folded into vague terminology.
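The decompose-and-check pattern can be sketched as a small checklist. This is a minimal illustration under stated assumptions: the names `Aspect` and `Audit` are hypothetical, and the pass/fail verdicts are supplied by a human reviewer rather than computed. The example verdicts mirror the video-generator case described earlier in this note.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Aspect(Enum):
    """The five design choices, numbered as in this note."""
    DATA = 1
    REPRESENTATION = 2
    ARCHITECTURE = 3
    OBJECTIVE = 4
    DECISION_INTEGRATION = 5


@dataclass
class Audit:
    """Hypothetical record of a reviewer's per-aspect verdicts on one proposal."""
    proposal: str
    verdicts: Dict[Aspect, bool] = field(default_factory=dict)

    def record(self, aspect: Aspect, ok: bool) -> None:
        self.verdicts[aspect] = ok

    def unchecked(self) -> List[Aspect]:
        # Aspects nobody has examined yet; the claim is still ambiguous there.
        return [a for a in Aspect if a not in self.verdicts]

    def failing(self) -> List[Aspect]:
        return [a for a in Aspect if self.verdicts.get(a) is False]

    def is_functional_world_model(self) -> bool:
        # Every aspect must be checked and none may fail.
        return not self.unchecked() and not self.failing()


# Verdicts for the video-generator example: strong on 1, 2, 4; missing 3 and 5.
audit = Audit("video generator")
for aspect in (Aspect.DATA, Aspect.REPRESENTATION, Aspect.OBJECTIVE):
    audit.record(aspect, True)
audit.record(Aspect.ARCHITECTURE, False)
audit.record(Aspect.DECISION_INTEGRATION, False)

failed = [a.name for a in audit.failing()]
```

Keeping "unchecked" distinct from "failing" reflects the note's point that vague terminology hides disagreements: an aspect nobody has examined is a different finding from one that was examined and found lacking.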

Original note title

building a world model decomposes into five inseparable design choices — data representation architecture objective and decision-system integration