INQUIRING LINE

Which AI interaction patterns preserve learning, and which ones degrade skill formation?

This explores a split the corpus draws sharply: which ways of updating an AI system (or a human working with one) build durable, composable skill — and which ones quietly erode it through forgetting, collapse, or over-trust.


This reads the question as: what's the difference between interaction patterns that accumulate skill and ones that hollow it out — across both how agents learn and how humans learn alongside them. The corpus has a surprisingly consistent answer, and it's about *where* learning gets stored. When skill lives in an external, inspectable, composable form, it compounds. When it gets baked into weights or compressed under reward pressure, it degrades.

The clearest preserving pattern is externalization. VOYAGER stores executable skills in an indexed library and builds complex behaviors by composing simpler ones — sidestepping the catastrophic forgetting that hits weight-update methods, where learning the new thing overwrites the old (Can agents learn new skills without forgetting old ones?). Reflexion shows the same move in miniature: an agent that writes a verbal self-diagnosis into episodic memory improves across attempts without ever touching its weights, and notably keeps those reflections *uncompressed* because squeezing them destroys their usefulness (Can agents learn from failure without updating their weights?). SkillOS pushes further — a separately trained curator evolves the library from verbose clutter toward reusable meta-strategies, and that curation generalizes across different underlying agents (Can a separate trained curator improve skill libraries better than frozen agents?).
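To make the externalization idea concrete, here is a minimal sketch of a skill library in which new skills are built by composing stored ones and nothing is overwritten. It is an illustration under stated assumptions, not VOYAGER's implementation: the `SkillLibrary` class, the skill names, and the dictionary-based state are all hypothetical, and a plain name lookup stands in for the embedding index the note describes.

```python
from typing import Callable, Dict, List

# Assumption for this sketch: a skill is a function from a state dict to a new state dict.
Skill = Callable[[dict], dict]


class SkillLibrary:
    """Toy external skill store: skills are added by name and never overwritten."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def add(self, name: str, skill: Skill) -> None:
        # Refusing to overwrite is what keeps old skills intact.
        if name in self._skills:
            raise ValueError(f"skill {name!r} already exists")
        self._skills[name] = skill

    def get(self, name: str) -> Skill:
        return self._skills[name]

    def compose(self, name: str, parts: List[str]) -> None:
        """Register a new skill that runs existing skills in order."""
        steps = [self._skills[p] for p in parts]

        def composed(state: dict) -> dict:
            for step in steps:
                state = step(state)
            return state

        self.add(name, composed)


def mine_wood(state: dict) -> dict:
    return {**state, "wood": state.get("wood", 0) + 1}


def craft_plank(state: dict) -> dict:
    if state.get("wood", 0) < 1:
        raise RuntimeError("need at least one wood")
    return {**state, "wood": state["wood"] - 1, "plank": state.get("plank", 0) + 4}


library = SkillLibrary()
library.add("mine_wood", mine_wood)
library.add("craft_plank", craft_plank)
library.compose("make_planks", ["mine_wood", "craft_plank"])

result = library.get("make_planks")({})
print(result)
```

Adding `make_planks` leaves `mine_wood` and `craft_plank` callable exactly as before, which is the property weight updates cannot guarantee.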

The degrading patterns share a signature: they collapse diversity or detail to optimize something narrow. RL training reliably squeezes exploration breadth through entropy collapse: policies converge on a few reward-maximizing moves, and this happens in search agents for the same reason it happens in reasoning. Tellingly, plain supervised fine-tuning on diverse demonstrations *preserves* the breadth that RL throws away (Does reinforcement learning squeeze exploration diversity in search agents?). Memory and context show the analogous failure: naive compression causes "brevity bias" and context collapse, which is why the ACE framework treats context as an evolving playbook updated incrementally rather than rewritten wholesale (Can context playbooks prevent knowledge loss during iteration?), and why DeepAgent's memory folding only works because its *structured* schemas avoid the degradation that sloppy consolidation causes (Can agents compress their own memory without losing critical details?). The pattern: compression preserves skill only when it's structured; compression for its own sake erodes it.
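One way to see what entropy collapse means in practice is to measure the Shannon entropy of the actions a policy actually takes. The sketch below is an illustration only: the action names and the two sample logs are made up, and the notes summarized here do not specify how the underlying papers measure diversity.

```python
import math
from collections import Counter
from typing import Sequence


def action_entropy(actions: Sequence[str]) -> float:
    """Shannon entropy (in bits) of the empirical action distribution."""
    total = len(actions)
    if total == 0:
        return 0.0
    counts = Counter(actions)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical logs: a policy that spreads over four moves vs. one that has collapsed.
diverse = ["search", "browse", "refine", "verify"] * 25
collapsed = ["search"] * 97 + ["browse"] * 3

print(f"diverse:   {action_entropy(diverse):.3f} bits")
print(f"collapsed: {action_entropy(collapsed):.3f} bits")
```

A uniform spread over four actions gives 2 bits; the collapsed log sits far below that, which is the kind of narrowing the paragraph above attributes to reward pressure.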

There's a subtler timing finding worth knowing. RL learning isn't uniform — it runs in two phases, first consolidating execution correctness, then shifting to strategic planning as the bottleneck. Concentrating optimization on planning tokens in that second phase is where the real gains hide (Does RL training follow a predictable two-phase learning sequence?). So "preserve vs degrade" isn't only about method — it's about whether you're pushing on the part of the skill that's actually still forming.
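As a rough illustration of what "concentrating optimization on planning tokens" could look like, the sketch below upweights planning tokens when averaging per-token losses. Everything here is an assumption made for the example: the `is_planning` labels, the `planning_weight` value, and the weighted-mean form are hypothetical and are not taken from the cited work, which may implement the idea quite differently.

```python
from typing import Sequence


def weighted_token_loss(
    losses: Sequence[float],
    is_planning: Sequence[bool],
    planning_weight: float = 2.0,
) -> float:
    """Weighted mean of per-token losses, with planning tokens counted more heavily.

    Assumption: each token has already been labeled as planning or execution.
    """
    if len(losses) != len(is_planning):
        raise ValueError("losses and is_planning must have the same length")
    if not losses:
        return 0.0
    weights = [planning_weight if p else 1.0 for p in is_planning]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Hypothetical example: two execution tokens with low loss, one planning token with high loss.
losses = [1.0, 1.0, 4.0]
is_planning = [False, False, True]

uniform = weighted_token_loss(losses, is_planning, planning_weight=1.0)
focused = weighted_token_loss(losses, is_planning, planning_weight=2.0)
print(uniform, focused)
```

With uniform weights the planning token contributes a third of the signal; doubling its weight shifts the average toward the part of the skill that, in the second phase, is still forming.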

The twist the corpus adds is that the human side of the interaction has its own degradation mode, and it's not about forgetting — it's about over-trusting. Rose-Frame identifies three compounding cognitive traps (mistaking the model's map for the territory, conflating intuition with reasoning, and confirmation-bias reinforcement) that multiply when they co-occur, producing epistemic drift in people who lean on AI (Why do people trust AI outputs they shouldn't?). And over repeated rounds, people *learn* to prefer reliable AI partners even against an initial anti-AI bias (Do humans learn to prefer AI partners over time?) — which means the interaction patterns that build the most user trust may be exactly the ones that most quietly degrade the user's own independent skill. The preserving move for humans mirrors the agent one: keep the reasoning external and inspectable rather than letting it collapse into a frictionless answer you stop checking.


Sources 9 notes

Can agents learn new skills without forgetting old ones?

VOYAGER demonstrates that storing executable skills in an embedding-indexed library and composing complex skills from simpler ones allows agents to learn continuously while avoiding the forgetting that occurs with weight-update-based methods. Environmental feedback refines skills while an automatic curriculum drives continual exploration.

Can agents learn from failure without updating their weights?

Reflexion demonstrates that unambiguous environmental feedback (success/failure) enables agents to write useful self-diagnoses and improve across episodes without parameter updates. The binary signal prevents rationalization, and keeping reflections uncompressed preserves their usability.

Can a separate trained curator improve skill libraries better than frozen agents?

SkillOS shows that separating a trainable curator from a frozen executor, grouped by task streams, causes skill repositories to shift from generic verbose additions toward actionable execution logic and cross-task meta-strategies. The trained curator generalizes across different executor backbones and domains.

Does reinforcement learning squeeze exploration diversity in search agents?

RL training compresses behavioral diversity in search agents through the same entropy collapse mechanism documented in reasoning—policies converge on narrow reward-maximizing strategies. SFT on diverse demonstrations preserves exploration breadth, suggesting diversity-preservation techniques are essential for RL search scaling.

Can context playbooks prevent knowledge loss during iteration?

The ACE framework treats contexts as evolving playbooks using generation-reflection-curation loops rather than full rewrites. This prevents knowledge loss from compression and detail erosion, achieving +10.6% on agentic tasks and +8.6% on finance without labeled supervision.

Can agents compress their own memory without losing critical details?

DeepAgent's autonomous memory folding consolidates interaction history into episodic, working, and tool memory schemas. This reduces token overhead while letting agents pause to reconsider strategies—the autonomy and structure together avoid degradation that plagues poorly designed consolidation.

Does RL training follow a predictable two-phase learning sequence?

Across eight models, RL training consistently shows a first phase where execution correctness drives learning, followed by a second phase where strategic planning becomes the bottleneck. Planning token entropy increases while execution entropy stabilizes, and concentration of optimization on planning tokens yields significant performance gains.

Why do people trust AI outputs they shouldn't?

Rose-Frame identifies map-territory confusion, intuition-reason conflation, and confirmation-bias reinforcement as traps that multiply their distorting effects when they co-occur. Evidence from cross-linguistic overreliance and architectural transformer biases confirms the compounding mechanism operates universally.

Do humans learn to prefer AI partners over time?

In partner selection games (N=975), AI agents initially faced selection bias when identity was disclosed, but outcompeted humans over repeated rounds as participants learned to associate bot identity with reliable, prosocial behavior. AI agents returned more points consistently with lower variance than humans.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about which AI interaction patterns preserve vs. degrade skill formation—in both agent learning and human-AI teams. The question remains open: what's the dividing line?

What a curated library found — and when (dated claims, not current truth):
Findings span Oct 2023–May 2026. A library of agent-learning and human-AI interaction papers identified these patterns:
• Externalized, indexed skill libraries (VOYAGER; SkillOS, ~2026) avoid catastrophic forgetting and compound through composition, whereas weight-update methods degrade older skills (~2023–2026).
• RL training squeezes exploration diversity and converges policies toward reward-maximizing moves; supervised fine-tuning on diverse demos preserves breadth (~2025).
• Memory compression degrades skill unless *structured* (ACE playbooks, DeepAgent schemas, ~2025–2026); naive consolidation causes "brevity bias" and context collapse.
• RL exhibits a two-phase dynamic: procedural consolidation first, then strategic planning; gains hide in the planning phase (~2025).
• Humans co-learning with AI face three cognitive traps (map–territory confusion, intuition–reasoning conflation, confirmation bias) that multiply and produce epistemic drift; people learn to over-trust reliable AI partners, risking their own skill atrophy (~2025–2026).

Anchor papers (verify; mind their dates):
• 2310.10134 (CLIN, Oct 2023): continual learning and rapid task adaptation.
• 2605.06614 (SkillOS, May 2026): skill curation and self-evolving agents.
• 2507.13524 (humans prefer trustworthy AI, Jul 2025): human preference drift in hybrid teams.
• 2510.14665 (illusion of understanding, Oct 2025): epistemic gaps in human–LLM interaction.

Your task:
(1) RE-TEST EACH CONSTRAINT. For externalization, RL diversity collapse, memory structuring, and two-phase RL dynamics: has scaling, new training algorithms (e.g., Vector Policy Optimization, ~May 2026), in-context learning, or multi-agent orchestration since RELAXED or OVERTURNED these findings? Separate the durable question—what *fundamentally* preserves composable skill?—from perishable limitations. Where do constraints still hold?
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months (Nov 2025–May 2026). Do socialization (Moltbook, ~Feb 2026) or multi-agent co-player inference (~Feb 2026) show that the external-vs-compressed dichotomy breaks down in multi-agent regimes?
(3) Propose 2 research questions that ASSUME the regime may have shifted: (a) Does in-context skill evolution (ACE, ~2025) now *replace* library indexing for certain task families? (b) Can structured memory + human interpretability guarantees *reverse* epistemic drift, or is the over-trust trap irreversible at scale?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
