Can external managers compress context better than frozen agents?
Explores whether offloading context management to a trained external system can adapt compression strategies to individual agent strengths, rather than forcing agents to manage their own context constraints.
Long-horizon agents accumulate context (tool results, intermediate reasoning) until stale content obscures salient evidence, amplifies positional bias, and degrades decisions. Prior fixes put the burden of managing context on the agent itself, through agent-side control or fixed summarization. This requires training the agent, which is impractical for closed-source agents, and it ignores that different agents need different strategies.
AdaCoM separates the concern entirely: train an external LLM to manage the context of a frozen agent through flexible modification actions and end-to-end RL. The manager prunes stale content while preserving task constraints and progress, improving diverse agents on web-search and deep-research benchmarks and transferring to unseen agents of similar capability.
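The separation between a frozen agent and an external manager can be sketched as a simple loop. This is a minimal illustration, not AdaCoM's implementation: the `Entry` and `Action` types, the `keep`/`drop`/`replace` action set, and `rule_manager` are all hypothetical names chosen for the example. In particular, `rule_manager` is a hand-written stand-in for the manager, which in the note is an LLM trained with end-to-end RL.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Entry:
    kind: str  # hypothetical labels: "task", "tool_result", "reasoning"
    text: str


@dataclass
class Action:
    op: str  # hypothetical action set: "keep", "drop", "replace"
    index: int
    text: str = ""


def apply_actions(context: List[Entry], actions: List[Action]) -> List[Entry]:
    """Return a new context; entries with no action are kept unchanged."""
    by_index = {a.index: a for a in actions}
    out: List[Entry] = []
    for i, entry in enumerate(context):
        action = by_index.get(i)
        if action is None or action.op == "keep":
            out.append(entry)
        elif action.op == "replace":
            out.append(Entry(entry.kind, action.text))
        elif action.op != "drop":
            raise ValueError(f"unknown op: {action.op}")
    return out


def rule_manager(context: List[Entry], keep_recent: int = 2) -> List[Action]:
    """Stand-in policy: drop all but the most recent tool results.

    Task entries are never dropped, mirroring the requirement that the
    manager preserve task constraints. A trained manager would learn this.
    """
    tool_idx = [i for i, e in enumerate(context) if e.kind == "tool_result"]
    stale = tool_idx[:-keep_recent] if keep_recent > 0 else tool_idx
    return [Action("drop", i) for i in stale]


def run_episode(
    agent_step: Callable[[List[Entry]], Optional[Entry]],
    manager: Callable[[List[Entry]], List[Action]],
    context: List[Entry],
    max_steps: int,
) -> List[Entry]:
    """The agent is frozen: it only ever sees the context the manager produced."""
    for _ in range(max_steps):
        context = apply_actions(context, manager(context))
        new_entry = agent_step(context)
        if new_entry is None:  # agent signals it is done
            break
        context.append(new_entry)
    return context
```

The design point is that `agent_step` is an opaque callable: nothing in the loop needs access to the agent's weights, so the same structure applies to a closed-source agent.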
The most useful finding is a fidelity–reliability trade-off. Agents with higher vanilla ReAct performance benefit from higher-fidelity context preservation, because they can use more detail well. Lower-performing agents require more aggressive compression to stay within a reliable reasoning regime. The right amount of context is not a property of the task alone; it is indexed to the agent's own competence. Context management is therefore not one universal policy but a per-agent calibration. This is consistent with "Does fixed sparsity work for all sequence lengths?", where the optimal budget is also conditional rather than fixed.
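To make the per-agent calibration concrete, here is a toy sketch in which the preserved fraction of context rises with an agent's baseline accuracy. Everything here is an assumption made for illustration: the linear rule, the `min_keep` and `max_keep` bounds, and the recency-based `compress` helper are hypothetical. In the note the trade-off comes from a learned manager, not from a hand-set formula.

```python
from typing import List


def keep_budget(baseline_accuracy: float,
                min_keep: float = 0.2,
                max_keep: float = 0.9) -> float:
    """Fraction of context to preserve (illustrative linear rule).

    Stronger agents (higher baseline accuracy) get a larger budget;
    weaker agents get more aggressive compression.
    """
    if not 0.0 <= baseline_accuracy <= 1.0:
        raise ValueError("baseline_accuracy must be in [0, 1]")
    return min_keep + (max_keep - min_keep) * baseline_accuracy


def compress(entries: List[str], keep_fraction: float) -> List[str]:
    """Keep the first entry (assumed to be the task statement) plus the
    most recent entries, up to the budget."""
    if not entries:
        return []
    n_keep = max(1, round(len(entries) * keep_fraction))
    if n_keep >= len(entries):
        return list(entries)
    tail = entries[len(entries) - (n_keep - 1):] if n_keep > 1 else []
    return [entries[0]] + tail


history = ["task", "a", "b", "c", "d", "e", "f", "g", "h", "i"]
strong_view = compress(history, keep_budget(0.8))  # higher-fidelity context
weak_view = compress(history, keep_budget(0.2))    # aggressively compressed
```

With the same history, the stronger agent sees most of the trajectory while the weaker one sees only the task and the latest steps, which is the shape of the trade-off described above.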
Inquiring lines that use this note as a source (23)
This note is a source for these synthesized inquiries.
- Why does continuous agent inference differ from human user inference?
- How do perception and execution gaps limit current AI agent performance?
- Can the same compress-then-act pattern work for agent state memory?
- Can task-agnostic compression of documents remain broadly useful for later queries?
- Does recurrent memory or gist compression work better for ultra-long context?
- Can external managers optimize context better than the model itself?
- How do memory hierarchies and compression reduce context management demands?
- How does context engineering bridge human intent and machine understanding?
- What components of agent scaffolding most impact domain-specific output quality?
- Why is digital context more volatile than conventional software context?
- Why do weaker agents need more aggressive context compression than stronger ones?
- How does external context control compare to agents managing their own state internally?
- Should optimal context budgets scale with agent competence or task complexity?
- Can context management policies transfer across agents of similar capability levels?
- Why should consolidation be scheduled offline rather than during forward passes?
- How should agents compress episodic interactions into working memory without accumulation?
- Can externalizing bookkeeping to a stateful harness replace internalized memory control?
- What makes persistent, shared code artifacts from agents hard to manage at scale?
- What specific bookkeeping tasks can environments maintain more reliably than policies?
- How does reducing activation precision further extend context length?
- Can fixed-size latent states losslessly store arbitrary input context?
- Why does externalized state beat parameter scaling for agent reliability?
- How does externalizing reasoning into harness artifacts improve agent reliability?
Related concepts in this collection (3)
- Can agents fail from weak memory control rather than missing knowledge?
  As multi-turn agent workflows grow longer, performance degrades, but is this due to insufficient context or poor memory management? This explores whether memory *control* is the real bottleneck.
  adjacent solution to the same accumulation problem; ACC commits state internally, AdaCoM manages it externally
- Can context playbooks prevent knowledge loss during iteration?
  When AI systems iteratively refine their instructions and memories, do structured incremental updates better preserve domain knowledge than traditional rewriting? This matters because context degradation undermines long-term agent performance.
  both treat context as actively managed rather than passively appended
- Where does agent reliability actually come from?
  Exploring whether LLM agent performance depends on larger models or on thoughtful system design choices like memory, skills, and protocols that shift cognitive work outside the model.
  the external manager is harness infrastructure for a frozen model
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Learning Agent-Compatible Context Management for Long-Horizon Tasks
- Large Language Model Agents Are Not Always Faithful Self-Evolvers
- From Model Scaling to System Scaling: Scaling the Harness in Agentic AI
- Towards a Science of Scaling Agent Systems
- Adaptation of Agentic AI
- Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
- Toward Efficient Agents: A Survey of Memory, Tool Learning, and Planning
- Artifacts as Memory Beyond the Agent Boundary
Original note title
context management can be offloaded to a trained external manager for a frozen agent and optimal compression depends on the agent's own reliability