How should agents decide what memories to keep?
Agent memory management splits into two approaches: either the agent autonomously recognizes important information, or programmatic triggers decide when memory is written. Understanding this choice helps explain why different memory architectures prioritize different information types.
Agent memory management (how information moves between the LLM's context window and external storage) decomposes into two fundamentally different paths that parallel the human explicit/implicit memory distinction:
Explicit memory (hot path): The agent autonomously recognizes important information during conversation and decides to remember it via tool calling. This mirrors human conscious storage (episodic and semantic memory). The advantage: context-sensitive importance assessment — the agent can judge what matters based on the current conversational state. The challenge: implementing robust importance recognition is hard. What counts as "important enough to remember" depends on the user, the task, and future needs that can't be predicted.
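The hot path can be sketched as a tool the agent may call in the middle of a turn. This is a minimal illustration, not any framework's API: `MemoryStore`, `toy_policy`, and `handle_turn` are hypothetical names, and the keyword check in `toy_policy` only stands in for the model's own judgement about what is worth remembering.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy long-term store; a real system would use an external database."""
    facts: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> str:
        """Tool exposed to the agent: persist one fact, skipping duplicates."""
        if fact not in self.facts:
            self.facts.append(fact)
        return f"stored: {fact}"


def toy_policy(message: str) -> list[dict]:
    """Hypothetical stand-in for the model's decision to call a tool.

    A real agent would emit tool calls from the LLM; this keyword check
    only marks the place where that judgement happens.
    """
    if "my birthday is" in message.lower():
        return [{"tool": "remember", "args": {"fact": message.strip()}}]
    return []


def handle_turn(message: str, store: MemoryStore, policy=toy_policy) -> list[str]:
    """Run one turn: collect tool calls from the policy and execute them."""
    tools = {"remember": store.remember}
    return [tools[call["tool"]](**call["args"]) for call in policy(message)]


store = MemoryStore()
handle_turn("My birthday is 3 March", store)
handle_turn("Nice weather today", store)
print(store.facts)
```

The hard part named above lives entirely inside the policy: swapping the keyword check for a model call changes nothing else in the loop, which is why importance recognition is the real design problem here.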
Implicit memory (background): Memory management is programmatically defined at specific trigger points:
- After a session — batch process the entire conversation post-session
- At periodic intervals — transfer session data to long-term memory on a schedule (for long-running conversations)
- After every turn — real-time updates (highest fidelity, highest cost)
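The three trigger points above can be compared in a small sketch. Everything here is an assumption made for illustration: the `Trigger` and `BackgroundMemory` names are hypothetical, the periodic schedule is approximated by a turn count rather than wall-clock time, and "transfer to long-term memory" is reduced to copying raw messages with no summarization.

```python
from dataclasses import dataclass, field
from enum import Enum


class Trigger(Enum):
    AFTER_SESSION = "after_session"
    PERIODIC = "periodic"
    EVERY_TURN = "every_turn"


@dataclass
class BackgroundMemory:
    trigger: Trigger
    every_n_turns: int = 3  # used only by PERIODIC; stands in for a timer
    buffer: list[str] = field(default_factory=list)
    long_term: list[str] = field(default_factory=list)
    flushes: int = 0  # number of write operations, a rough proxy for cost

    def _flush(self) -> None:
        """Move buffered messages to long-term memory, if there are any."""
        if self.buffer:
            self.long_term.extend(self.buffer)
            self.buffer.clear()
            self.flushes += 1

    def on_turn(self, message: str) -> None:
        """Called by the host program after each turn, not by the agent."""
        self.buffer.append(message)
        if self.trigger is Trigger.EVERY_TURN:
            self._flush()
        elif self.trigger is Trigger.PERIODIC and len(self.buffer) >= self.every_n_turns:
            self._flush()

    def on_session_end(self) -> None:
        """Every policy flushes leftovers when the session closes."""
        self._flush()


for trigger in Trigger:
    mem = BackgroundMemory(trigger=trigger)
    for i in range(5):
        mem.on_turn(f"msg {i}")
    mem.on_session_end()
    print(trigger.value, mem.flushes, len(mem.long_term))
```

All three policies end with the same five messages stored; they differ in how many writes it took and in how stale long-term memory is while the session is still running.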
The CoALA vs. Letta taxonomy debate reveals a deeper design question about working memory. CoALA treats working memory as a single category. Letta splits it into a message buffer (recent messages from the current conversation) and core memory (specific information the agent self-manages, such as the user's birthday). This split matters because core memory is agent-curated while the message buffer is conversation-driven: the two have different update mechanisms and hold different types of information.
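The difference in update mechanisms can be made concrete with a small data-structure sketch. This is not Letta's API: `WorkingMemory`, `add_message`, `core_set`, and `render` are hypothetical names, and the fixed-size window is an assumed eviction rule for the message buffer.

```python
from collections import deque


class WorkingMemory:
    """Toy working memory with two parts that are updated differently."""

    def __init__(self, buffer_size: int = 4) -> None:
        # Conversation-driven: every message is appended, oldest ones fall off.
        self.messages: deque = deque(maxlen=buffer_size)
        # Agent-curated: changes only when the agent decides to write.
        self.core: dict[str, str] = {}

    def add_message(self, role: str, text: str) -> None:
        """Called for every message, with no judgement involved."""
        self.messages.append((role, text))

    def core_set(self, key: str, value: str) -> None:
        """Called only when the agent chooses to keep a specific fact."""
        self.core[key] = value

    def render(self) -> str:
        """Assemble the text that would be placed in the context window."""
        core_lines = [f"{k}: {v}" for k, v in sorted(self.core.items())]
        msg_lines = [f"{role}: {text}" for role, text in self.messages]
        return "\n".join(["[core]", *core_lines, "[messages]", *msg_lines])


wm = WorkingMemory(buffer_size=2)
wm.add_message("user", "My birthday is 3 March")
wm.core_set("birthday", "3 March")
wm.add_message("assistant", "Noted!")
wm.add_message("user", "What should I cook tonight?")
print(wm.render())
```

In this sketch the original birthday message has already been evicted from the buffer, while the curated fact survives in core memory. That is the practical consequence of the two parts having different update mechanisms.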
Neither taxonomy cleanly maps human memory types to agent implementations:
- CoALA's semantic memory ≈ Letta's archival memory (explicitly stored knowledge in external DB)
- But CoALA's procedural/episodic memory ≠ Letta's recall memory (raw conversation history)
- CoALA doesn't include raw conversation history in long-term memory at all
The six core components of agent memory management — generation, storage, retrieval, integration, updating, and deletion (forgetting) — each face the explicit/implicit design choice independently. You might generate memories explicitly (hot-path recognition) but delete them implicitly (TTL-based expiration). The design space is combinatorial.
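To get a feel for the size of that design space, the sketch below enumerates every assignment of explicit or implicit to the six components and shows TTL-based expiration as one implicit deletion rule. Treating each component as a strict binary choice is a simplifying assumption, and `all_designs` and `expire` are hypothetical helper names.

```python
from itertools import product

COMPONENTS = ("generation", "storage", "retrieval",
              "integration", "updating", "deletion")
MODES = ("explicit", "implicit")


def all_designs() -> list[dict[str, str]]:
    """Every way to assign one mode to each component (2**6 combinations)."""
    return [dict(zip(COMPONENTS, combo))
            for combo in product(MODES, repeat=len(COMPONENTS))]


def expire(memories: list[dict], now: float, ttl_seconds: float) -> list[dict]:
    """Implicit deletion: keep only memories younger than the TTL."""
    return [m for m in memories if now - m["created_at"] < ttl_seconds]


designs = all_designs()
print(len(designs))

# The mixed design from the text: explicit generation, implicit deletion.
mixed = [d for d in designs
         if d["generation"] == "explicit" and d["deletion"] == "implicit"]
print(len(mixed))

memories = [{"text": "old fact", "created_at": 0.0},
            {"text": "recent fact", "created_at": 90.0}]
print(expire(memories, now=100.0, ttl_seconds=60.0))
```

Even under the binary simplification there are 64 configurations, 16 of which pair explicit generation with implicit deletion, so comparing architectures component by component is more informative than labelling a whole system as explicit or implicit.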
Following the note "Can AI agents learn when they have something worth saying?", the inner thoughts mechanism could serve as the importance recognition layer for explicit hot-path memory: the agent's continuous covert thoughts would identify what is worth remembering, which could address the "what matters" problem.
Inquiring lines that use this note as a source (12)
This note is a source for these synthesized inquiries.
- Could a single agent system switch memory granularity between tasks?
- Should agents update memory after every turn or batch process sessions?
- Why do different agent memory architectures make incompatible granularity claims?
- What makes a memory reachable in the right context?
- Which memory components trigger context-length problems in agents?
- How do agents decide which created code should persist versus disappear?
- Can agent-controlled memory management outperform fixed consolidation schedules?
- How do planning and memory compress agentic system costs?
- How do memory-resident safeguards get surfaced at the exact decision point where they matter?
- What makes memory curation harder to solve than simply expanding storage?
- How should memory systems split between short-term and long-term storage?
- What separates artifact recall from persistent memory commitment in agents?
Related concepts in this collection (6)
- Can AI agents learn when they have something worth saying?
  What if AI proactivity came from modeling intrinsic motivation to participate rather than predicting who speaks next? This explores whether a framework based on human cognitive patterns (internal thought generation parallel to conversation) can make agents genuinely responsive rather than passively reactive.
  inner thoughts as the importance recognition layer for explicit memory
- Can context playbooks prevent knowledge loss during iteration?
  When AI systems iteratively refine their instructions and memories, do structured incremental updates better preserve domain knowledge than traditional rewriting? This matters because context degradation undermines long-term agent performance.
  context engineering operates in the working memory space that CoALA and Letta disagree about
- How should chatbot design vary by relationship duration?
  Do chatbots serving one-time users need different design than those supporting long-term relationships? This matters because applying the same design to all temporal profiles creates usability mismatches.
  the explicit/implicit memory choice should match the temporal archetype: ad-hoc supporters need minimal memory, persistent companions need rich explicit memory
- Can conversations themselves personalize without user profiles?
  Can a conversational AI learn about user traits and adapt in real time by rewarding itself for asking insightful questions, rather than relying on pre-collected profiles or historical data?
  curiosity reward provides a principled signal for what's worth remembering explicitly
- Can three axes replace the short-term long-term memory split?
  Does breaking agent memory into forms, functions, and dynamics provide a clearer framework than the traditional short-term/long-term distinction? This matters because current agent-memory literature lacks a unified vocabulary, making comparison between systems nearly impossible.
  generalization: hot/cold path maps onto the *dynamics* axis (formation/evolution/retrieval operators at different temporal scales); the 2025 survey reframes the CoALA-vs-Letta taxonomy debate as a special case of dynamics design
- How should agent memory split across time scales?
  Explores whether agent working memory should be organized by temporal scope, with some components persisting across a conversation and others refreshed each turn. Understanding this distinction could reveal why some memory designs fail.
  RAISE refines the working-memory side of this two-path split: hot vs cold is *who triggers updates*; dialogue-level vs turn-level is *what temporal scope is updated*. The two are orthogonal axes.
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
- OMNI-SIMPLEMEM: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory
- From Model Scaling to System Scaling: Scaling the Harness in Agentic AI
- Useful Memories Become Faulty When Continuously Updated by LLMs
- Toward Efficient Agents: A Survey of Memory, Tool Learning, and Planning
- Memory in the Age of AI Agents: A Survey — Forms, Functions and Dynamics
- Making Sense of Memory in AI Agents
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Original note title
agent memory has two distinct management paths — explicit hot-path memory via autonomous recognition and implicit background memory via programmatic processing