SYNTHESIS NOTE
Agentic Systems and Tool Use

Does agent memory work better at one level of abstraction?

Three competing architectures claim superior agent memory transfer using different abstraction levels. Do they all work, or does one architecture genuinely outperform the others across domains?

Synthesis note · 2026-05-03 · sourced from Action Models

Three papers from the agentic cluster (AWM, CLIN, and PRAXIS) each propose a different shape for agent memory, and each reports transfer gains. AWM extracts abstracted sub-task workflows ("search for a {product-name} on Amazon"), CLIN extracts causal abstractions ("opening doors may be necessary for movement between rooms"), and PRAXIS stores state-dependent action memories for local recall. The papers appear to give incompatible answers only because they implicitly answer different questions. The resolution is not "one wins" but "each wins in the domain where its abstraction matches the structure of the task."

Three domain-shape signatures predict three memory shapes:

Routine-rich domains (e-commerce flows, customer-service scripts, repetitive browser tasks): the variance is in arguments, not in topology. The same workflow recurs with different parameters. Workflow-routine memory compounds because complex workflows are built by composing simpler ones, and the composition graph stays stable across instances. AWM wins.

Environment-rich domains (embodied agents, scientific simulators, novel game environments): the variance is in causal structure, not in arguments. Action consequences depend on environmental state in ways that can be summarized as causal rules. Workflow memory fails because there are no recurring workflows; state-action memory fails because the state space is too large to recall locally. Causal-rule memory transfers because causal structure is the invariant. CLIN wins.

Spatially-rich web tasks (modern web UIs with dense local affordances, dynamic menus, context-dependent actions): the variance is in fine-grained UI state. Workflow abstractions throw away the local visual cues that distinguish a working action from a broken one. State-action local recall preserves what AWM compresses out. PRAXIS wins.
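To make the three memory shapes concrete, here is a minimal Python sketch of what one entry at each abstraction level might look like. It illustrates the distinction drawn above and is not the data model of AWM, CLIN, or PRAXIS. All class, field, and method names are hypothetical, and the placeholder is written `{product_name}` rather than `{product-name}` only so that Python's `str.format` accepts it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkflowMemory:
    """Workflow-level entry: a sub-task whose steps vary only in arguments."""
    name: str
    steps: list[str]

    def instantiate(self, **args: str) -> list[str]:
        # Same step topology every time; only the parameters change.
        return [step.format(**args) for step in self.steps]


@dataclass
class CausalRule:
    """Causal-level entry: a hedged statement linking an action to an outcome."""
    action: str
    outcome: str
    modality: str = "may"  # assumption: rules carry an explicit hedge word

    def render(self) -> str:
        return f"{self.action} {self.modality} be necessary for {self.outcome}"


@dataclass
class StateActionMemory:
    """State-action-level store: a local state key mapped to the action used there."""
    entries: dict[str, str] = field(default_factory=dict)

    def remember(self, state_key: str, action: str) -> None:
        self.entries[state_key] = action

    def recall(self, state_key: str) -> Optional[str]:
        # Exact-match lookup only; an unseen state returns None.
        return self.entries.get(state_key)


# Hypothetical usage, one entry per shape.
search = WorkflowMemory(
    name="search for a {product_name} on Amazon",
    steps=["type {product_name} into the search box", "click the search button"],
)
print(search.instantiate(product_name="kettle"))

print(CausalRule("opening doors", "movement between rooms").render())

ui_memory = StateActionMemory()
ui_memory.remember("settings-menu:collapsed", "click the menu toggle")
print(ui_memory.recall("settings-menu:collapsed"))
```

The workflow entry keeps structure and discards local state, the state-action store does the opposite, and the causal rule keeps neither steps nor states, only the dependency between them. That is the trade-off each domain signature above is selecting for.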

The deeper claim: agent memory design is not a horse race between architectures but a domain-classification problem. Before choosing a memory architecture, classify the deployment domain along the routine-richness, environment-causality, and spatial-density axes — each axis predicts a memory shape. Reframing the AWM/CLIN/PRAXIS contest this way also explains why parallel benchmark wins coexisted: the benchmarks differed along these axes too, so each architecture won in its native habitat. A composite memory system that selects abstraction level per task class would likely beat any single-architecture system on a heterogeneous workload.
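The classify-then-select step can be sketched as a small routing function. This is an illustration under stated assumptions, not a method from any of the three papers: the 0-to-1 axis scores, the `DomainProfile` type, and the "highest axis wins" rule are all hypothetical, and how to measure each axis for a real deployment is left open by the note.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryShape(Enum):
    WORKFLOW = "workflow-level"          # the AWM-style shape
    CAUSAL = "causal-level"              # the CLIN-style shape
    STATE_ACTION = "state-action-level"  # the PRAXIS-style shape


@dataclass(frozen=True)
class DomainProfile:
    """Hypothetical scores in [0, 1] for the three axes named in the note."""
    routine_richness: float
    environment_causality: float
    spatial_density: float


def select_memory_shape(profile: DomainProfile) -> MemoryShape:
    # Assumption: the dominant axis decides. Ties resolve to the first
    # entry in the dict (workflow, then causal, then state-action).
    scores = {
        MemoryShape.WORKFLOW: profile.routine_richness,
        MemoryShape.CAUSAL: profile.environment_causality,
        MemoryShape.STATE_ACTION: profile.spatial_density,
    }
    return max(scores, key=scores.get)


# Hypothetical profiles for the three domain types described above.
ecommerce = DomainProfile(0.9, 0.2, 0.3)
simulator = DomainProfile(0.1, 0.8, 0.2)
dense_web_ui = DomainProfile(0.3, 0.2, 0.9)

for label, profile in [("e-commerce", ecommerce),
                       ("simulator", simulator),
                       ("dense web UI", dense_web_ui)]:
    print(label, "->", select_memory_shape(profile).value)
```

A composite system of the kind suggested above would call such a selector per task class rather than once per deployment. A winner-take-all rule is the simplest choice; a weighted blend of memory shapes is equally compatible with the note's argument.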

Original note title

agent memory granularity is domain-conditional — workflow-level for routine-rich tasks, causal-level for environment-rich tasks, state-action-level for spatially-rich web tasks