SYNTHESIS NOTE
Model Architecture and Internals

Has memory architecture replaced parameter count as the scaling frontier?

Late-2025 research suggests the field's next major efficiency gains come from restructuring how models store and use experience rather than simply making them larger. Three convergent signals point to this shift.

Synthesis note · 2026-05-18 · sourced from Memory
How should we allocate compute budget at inference time? What kind of thing is an LLM really?

Three pieces of late-2025 memory research, taken together, point at the same shift: parameter count has stopped being the most useful axis to scale. Memory architecture has taken its place.

Signal one: the field can finally taxonomize itself. Two major surveys (Memory in the Age of AI Agents, AI Hippocampus) appearing within months of each other propose orthogonal but compatible three-axis taxonomies: forms × functions × dynamics and implicit × explicit × agentic, respectively. Surveys taxonomize after the fact; two of them in such quick succession means the design space has matured to the point where comparing systems requires a shared vocabulary. A field develops that need only when architecture is the primary variable being designed.

Signal two: memory and compute scale together, not separately. ReasoningBank's MaTTS (memory-aware test-time scaling) finding shows that test-time scaling generates contrastive signals, which improve memory, which in turn guides future scaling: a compounding loop. This makes memory-driven experience scaling a new scaling dimension rather than a multiplier on existing ones. Parameter scaling laws (Kaplan, Chinchilla) predict loss as a function of model size, data, and compute; MaTTS suggests an additional term: cumulative interaction history processed into structured memory.
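
The compounding loop can be made concrete with a toy sketch. This is not ReasoningBank's implementation: MemoryBank, scaled_attempt, and the toy_* stand-ins are hypothetical names invented for illustration, and retrieval is reduced to "most recent notes". The only thing it is meant to show is the shape of the loop: several rollouts per task, a contrast between successes and failures, a distilled note written to memory, and that note steering the next round of rollouts.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBank:
    """Toy store of distilled strategy notes (hypothetical structure)."""
    items: list[str] = field(default_factory=list)

    def retrieve(self, k: int = 3) -> list[str]:
        # Placeholder: return the k most recent notes. A real system would
        # rank notes by relevance to the task.
        return self.items[-k:]

    def add(self, note: str) -> None:
        if note not in self.items:
            self.items.append(note)


def scaled_attempt(task, memory, attempt, judge, distill, n_rollouts=4):
    """Run several rollouts on one task, then fold the contrast into memory.

    Returns the fraction of rollouts that the judge accepted.
    """
    hints = memory.retrieve()
    rollouts = [attempt(task, hints, seed) for seed in range(n_rollouts)]
    wins = [r for r in rollouts if judge(task, r)]
    losses = [r for r in rollouts if not judge(task, r)]
    # A contrastive signal exists only when the rollouts disagree.
    if wins and losses:
        memory.add(distill(task, wins, losses))
    return len(wins) / n_rollouts


# Toy stand-ins (assumptions, not a real agent): a rollout succeeds when
# memory supplied a hint, and otherwise only on even-numbered seeds.
def toy_attempt(task, hints, seed):
    succeeded = bool(hints) or seed % 2 == 0
    return f"{task}:{'ok' if succeeded else 'fail'}"


def toy_judge(task, rollout):
    return rollout.endswith(":ok")


def toy_distill(task, wins, losses):
    return "verify the answer before submitting"
```

With an empty MemoryBank, the first call to scaled_attempt sees mixed rollouts (success rate 0.5) and writes one note; the second call retrieves that note and every toy rollout succeeds (success rate 1.0). The numbers are artifacts of the toy stand-ins, not measurements.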

Signal three: sparsity is multi-dimensional. Engram's U-shaped scaling law shows that conditional memory and conditional computation are complementary sparsity axes: pure MoE underperforms a hybrid of MoE and lookup at matched parameter count and matched FLOPs. The largest gains appear in reasoning, not retrieval, because separating local lookup from global integration frees attention for composition. Parameters distributed across memory and computation outperform parameters concentrated in either alone.

The convergent story: returns from adding parameters are diminishing along a known curve; returns from restructuring memory are still in their early steep phase. This does not mean parameters stop mattering. It means the marginal next-generation improvement is more likely to come from architectural restructuring of memory than from another order of magnitude in size.

The counter-evidence — and why it sharpens rather than undermines the take. "Useful Memories Become Faulty" demonstrates that naive consolidation can regress below the no-memory baseline. This is exactly what should be expected if memory architecture is the bottleneck: the design choices in how to maintain memory matter more than whether to have it. The fragility is itself evidence that memory is the active variable. Parameter-count scaling does not have the same brittleness — adding parameters rarely makes a model worse. Adding consolidation can.

The writing angle: the prior scaling law era was about pretraining compute. The current era is about memory structures that determine how experience gets converted into improved behavior — and that conversion mechanism is now the design problem.


Original note title

memory architecture is the new scaling dimension — taxonomy surveys plus MaTTS plus Engram U-curve suggest memory has overtaken parameter count