INQUIRING LINE


Everyone agrees AI agents need memory organized along three axes — but no two researchers agree on which three.

How do the three-axis taxonomies of memory forms and functions differ?

This explores the different ways recent work carves agent memory into a small number of orthogonal axes — and how those carvings disagree about what the 'three' should even be.

This reads the question as: when researchers say memory has three axes, they don't all mean the same three — so what's actually being distinguished? The corpus has at least three competing schemes, and the interesting part is that they slice along different planes.

The most explicit one comes from a 2025 survey that proposes three axes: forms (token, parametric, latent), functions (factual, experiential, working), and dynamics (formation, evolution, retrieval) [note: Can three axes replace the short-term long-term memory split?]. Its argument is that the familiar short-term/long-term split isn't a real architectural axis at all; it's an emergent pattern that falls out of the dynamics axis (how memory forms and decays over time). So 'forms' is about *where* memory lives, 'functions' is about *what it's for*, and 'dynamics' is about *how it changes*. These are meant to be genuinely orthogonal: any real system occupies a position on all three at once.
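One way to make the three axes concrete is to write them down as types. The sketch below is illustrative only: the class names, the choice to model dynamics as three free-text fields, and the `rag_store` example are assumptions made here, not definitions taken from the survey.

```python
from dataclasses import dataclass
from enum import Enum


class Form(Enum):
    """Where the memory lives."""
    TOKEN = "token"
    PARAMETRIC = "parametric"
    LATENT = "latent"


class Function(Enum):
    """What the memory is for."""
    FACTUAL = "factual"
    EXPERIENTIAL = "experiential"
    WORKING = "working"


@dataclass(frozen=True)
class Dynamics:
    """How the memory changes. Free-text fields are an assumption of this sketch."""
    formation: str
    evolution: str
    retrieval: str


@dataclass(frozen=True)
class MemoryComponent:
    """A memory component described on all three axes at once."""
    name: str
    form: Form
    function: Function
    dynamics: Dynamics


# Hypothetical component, not taken from any cited system.
rag_store = MemoryComponent(
    name="rag_store",
    form=Form.TOKEN,
    function=Function.FACTUAL,
    dynamics=Dynamics(
        formation="append documents on ingest",
        evolution="none",
        retrieval="top-k similarity search",
    ),
)

# If forms and functions are orthogonal, every pairing is a legal cell:
# 3 forms x 3 functions = 9 cells.
form_function_cells = [(form, function) for form in Form for function in Function]
```

Keeping `form` and `function` as separate fields is the point of the exercise: a description that needs only one field to say both things is no longer using independent axes.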

A different three-way cut maps memory onto the brain rather than onto abstract properties: transformer weights as a consolidated neocortex, retrieval/RAG stores as hippocampal rapid encoding, and agentic state as prefrontal executive control [note: Can brain memory systems explain how LLMs should store knowledge?]. Notice that this overlaps the survey's 'forms' axis (weights ≈ parametric, retrieval ≈ token) but smuggles in function too: the prefrontal/agentic tier is defined by what it does, not where it sits. So this taxonomy collapses two of the survey's supposedly independent axes into one biological story, which is exactly the kind of conflation the forms/functions/dynamics framing is trying to pull apart.

Then there are taxonomies that aren't three-axis at all but get mistaken for siblings. RAISE decomposes agent working memory into four components across two granularities: dialogue-level (conversation history, scratchpad) vs. turn-level (examples, task trajectory) [note: How should agent memory split across time scales?]. That's a 2×2 (two granularities, two components each), and it lives *entirely inside* the survey's 'working' function, so it's not a rival taxonomy but a zoom-in on one cell. Similarly, the STIM work splits chain-of-thought memorization into local, mid-range, and long-range sources [note: Where do memorization errors arise in chain-of-thought reasoning?]. That is a three-way cut, but along *distance*, a single dimension, not three orthogonal ones.
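The four-component split described for RAISE can be sketched as a single record with two groups of fields. This is not RAISE's implementation: the `WorkingMemory` class, the `start_turn` method, and the update policy (dialogue-level fields persist, turn-level fields are rebuilt each turn) are assumptions chosen here to illustrate why the two granularities might call for different update rules.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingMemory:
    # Dialogue-level components: assumed to persist for the whole conversation.
    conversation_history: list[str] = field(default_factory=list)
    scratchpad: list[str] = field(default_factory=list)
    # Turn-level components: assumed to be rebuilt at the start of every turn.
    examples: list[str] = field(default_factory=list)
    task_trajectory: list[str] = field(default_factory=list)

    def start_turn(self, user_message: str, examples: list[str]) -> None:
        """Record the new message and reset the turn-level components."""
        self.conversation_history.append(user_message)
        self.examples = list(examples)
        self.task_trajectory = []


# Hypothetical two-turn conversation.
memory = WorkingMemory()
memory.start_turn("book a flight", examples=["example A"])
memory.task_trajectory.append("searched flights")
memory.start_turn("make it a window seat", examples=["example B"])

print(len(memory.conversation_history))  # 2
print(memory.task_trajectory)            # []
```

All four fields would sit under the survey's 'working' function, which is why this structure refines one part of the three-axis scheme instead of competing with it.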

The payoff of seeing these side by side: a 'three-axis taxonomy' can mean three independent dimensions (forms/functions/dynamics), three instances along one dimension (memorization by distance), or a brain analogy that quietly bundles dimensions together. This matters for a practical reason: the survey's whole claim is that you can only compare two memory systems precisely if your axes are actually orthogonal, and that memory structure, not parameter count, is now the live scaling frontier [note: Has memory architecture replaced parameter count as the scaling frontier?]. Taxonomies that conflate where/what/how make that comparison impossible, which is the real difference between them.
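A small sketch of what "precise comparison" could look like when axes are kept separate. The `Placement` type, the `differing_axes` helper, and the three example systems are hypothetical; dynamics is left out only to keep the example short.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    form: str      # where: "token", "parametric", or "latent"
    function: str  # what for: "factual", "experiential", or "working"


def differing_axes(a: Placement, b: Placement) -> list[str]:
    """Return the names of the axes on which two placements differ."""
    return [axis for axis in Placement._fields if getattr(a, axis) != getattr(b, axis)]


# Hypothetical systems, not drawn from any cited paper.
weights_facts = Placement(form="parametric", function="factual")
store_facts = Placement(form="token", function="factual")
store_scratch = Placement(form="token", function="working")

print(differing_axes(weights_facts, store_facts))   # ['form']
print(differing_axes(store_facts, store_scratch))   # ['function']
```

With one bundled label per system (for example, a single brain-region name), the same comparison could only report that two systems differ, not whether the difference is in where memory lives or in what it is for.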

Sources (5 notes)

Can three axes replace the short-term long-term memory split?

A 2025 survey reframes agent memory along forms (token/parametric/latent), functions (factual/experiential/working), and dynamics (formation/evolution/retrieval), showing that short/long-term phenomena emerge from temporal patterns rather than architectural separation. This enables precise system comparison and replaces vague implementation-based claims.

Can brain memory systems explain how LLMs should store knowledge?

Research shows transformer weights function as a distributed neocortex for consolidated knowledge, RAG stores as hippocampal indexing for rapid encoding, and agentic state as prefrontal executive control. The CLS framework predicts why hybrid systems outperform single-tier approaches and identifies missing consolidation mechanisms that prevent memory integration.

How should agent memory split across time scales?

RAISE shows that agent memory consists of four components organized at two granularities: dialogue-level (conversation history, scratchpad) versus turn-level (examples, task trajectory). This granularity distinction predicts different failure modes and update policies for each component.

Where do memorization errors arise in chain-of-thought reasoning?

The STIM framework identifies local, mid-range, and long-range memorization sources in CoT reasoning. Local memorization, which is based on preceding tokens, accounts for up to 67% of reasoning errors, especially as complexity increases and distributional shift occurs.

Has memory architecture replaced parameter count as the scaling frontier?

Three converging signals in late-2025 research—taxonomy maturation, memory-aware test-time scaling loops, and hybrid sparsity laws—show that returns from restructuring memory now exceed returns from adding parameters. The design bottleneck has shifted from compute to memory structure.

Papers this line draws on (8)

The research behind the notes this line reads — ranked by how closely each paper relates.

The AI Hippocampus: How Far are We From Human Memory? (1.74 match)
Memory in the Age of AI Agents: A Survey — Forms, Functions and Dynamics (1.74 match)
Useful Memories Become Faulty When Continuously Updated by LLMs (1.70 match)
Rethinking Memory as Continuously Evolving Connectivity (1.69 match)
Are We Ready For An Agent-Native Memory System? (1.69 match)
From Model Scaling to System Scaling: Scaling the Harness in Agentic AI (1.68 match)
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems (1.67 match)
Diagnosing Memorization in Chain-of-Thought Reasoning, One Token at a Time (0.93 match)

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a researcher stress-testing taxonomies of LLM memory. The question remains open: do three-axis memory schemes actually carve nature at its joints, or do they conflate independent dimensions?

What a curated library found — and when (dated claims, not current truth):
• A 2025 survey proposes forms (token, parametric, latent), functions (factual, experiential, working), and dynamics (formation, evolution, retrieval) as genuinely orthogonal axes; claims short-term/long-term is emergent, not fundamental (~2025).
• A competing biological taxonomy maps transformer weights→neocortex, retrieval/RAG→hippocampus, agentic state→prefrontal cortex; conflates the survey's 'forms' and 'functions' into one (~2026).
• RAISE decomposes agent working memory into a 2×2 (dialogue-level vs. turn-level components), which lives entirely inside one cell of the survey's taxonomy (~2025).
• Token-level memorization in chain-of-thought splits into three distance-based sources (local, mid-range, long-range) — a one-dimensional cut, not three orthogonal ones (~2025).
• The survey argues memory *architecture*, not parameter count, is now the scaling frontier; taxonomies that conflate where/what/how make rigorous system comparison impossible (~2025).

Anchor papers (verify; mind their dates):
• arXiv:2512.13564 — Memory in the Age of AI Agents: Forms, Functions, Dynamics (2025-12)
• arXiv:2601.09113 — The AI Hippocampus: How Far are We From Human Memory? (2026-01)
• arXiv:2508.02037 — Diagnosing Memorization in Chain-of-Thought Reasoning, One Token at a Time (2025-08)
• arXiv:2512.24601 — Recursive Language Models (2025-12)

Your task:
(1) RE-TEST ORTHOGONALITY. For each proposed axis (forms, functions, dynamics; or the biological triple), ask: has newer work (last 6 months) shown these dimensions genuinely decouple when systems scale, or do they remain entangled in practice? Does memory *architecture* actually predict scaling, or does parameter count still dominate? Separate the durable claim (orthogonal axes exist) from the perishable one (we've identified the right three).
(2) Surface work from late 2025–present that explicitly *rejects* any of these three-axis framings or proposes a fourth/fifth dimension. Flag disagreements on whether short-term/long-term is emergent or fundamental.
(3) Propose two research questions that assume the regime has shifted: (a) Are these taxonomies converging on a stable set of axes, or are new memory modalities (e.g., learned router states, in-context adapter weights, recursive embeddings) forcing redimensionalization? (b) If architecture matters for scaling, how do you measure "architecture distance" between two systems to predict capability leaps?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
