INQUIRING LINE

How can agents evolve their own skills without human input?

This explores how agents can improve their own capabilities — generating their own training signal, feedback, and curriculum — when no human is in the loop to label, demonstrate, or reward.


The corpus converges on a clear first move: stop relying on human demonstrations. Agents trained only on curated expert data hit a ceiling, because they never face their own mistakes during training and cannot generalize past what the curator imagined (Can agents learn beyond what their training data shows?). The alternative is to let the agent learn from the consequences of its own actions, treating the states that follow each action as a free supervision signal. Across eight environments, this matches expert-dependent baselines with half the data and gives a better warm start for later RL training (Can agents learn from their own actions without external rewards?).
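The idea of using future states as supervision can be illustrated with a minimal sketch. It assumes a toy one-dimensional environment and a count-based predictor; `step`, `collect_transitions`, and `fit_world_model` are hypothetical names chosen for illustration, not the method from the cited note. The point is only that the training label is the state the agent itself observed, so no reward or expert demonstration is needed.

```python
import random
from collections import Counter, defaultdict

def step(state: int, action: str) -> int:
    """Toy deterministic environment: a position on a line from 0 to 4."""
    delta = {"left": -1, "right": 1}[action]
    return min(4, max(0, state + delta))

def collect_transitions(n_steps: int, seed: int = 0):
    """Let the agent act on its own and record what happened next."""
    rng = random.Random(seed)
    state, data = 2, []
    for _ in range(n_steps):
        action = rng.choice(["left", "right"])
        next_state = step(state, action)
        # The supervision label is the observed next state, not a reward.
        data.append(((state, action), next_state))
        state = next_state
    return data

def fit_world_model(data):
    """Predict the most frequently observed next state for each (state, action)."""
    counts = defaultdict(Counter)
    for key, next_state in data:
        counts[key][next_state] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

model = fit_world_model(collect_transitions(200))
```

In a real agent the count table would be replaced by the policy model itself, trained to predict the consequences of its own actions before any reward-based fine-tuning.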

But where does the feedback come from if no human grades it? The corpus offers two answers. One is to manufacture it through self-play: in Ctx2Skill's three-role loop, a Challenger keeps raising difficulty (acting as curriculum), a Judge gives binary pass/fail verdicts (acting as reward), and both sides evolve through natural-language skill edits, with no human supervision at any step. The loop only works if adversarial pressure is balanced against a generalization safeguard; otherwise it collapses (Can language models learn skills without human supervision?). The other is to harvest the unambiguous feedback the environment already gives. Reflexion shows an agent can turn a bare success/failure signal into a written self-diagnosis stored in memory, improving across attempts with no weight updates. The binary signal matters because it blocks the agent from rationalizing its failure away (Can agents learn from failure without updating their weights?).
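A Reflexion-style loop can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published implementation: the `attempt`, `evaluate`, and `reflect` callables stand in for LLM and environment calls, and the toy functions below are hypothetical placeholders. What it shows is the control flow: a binary verdict triggers a written reflection, and only the text memory changes between trials.

```python
from typing import Callable, List, Tuple

def reflexion_loop(
    attempt: Callable[[List[str]], str],
    evaluate: Callable[[str], bool],
    reflect: Callable[[str], str],
    max_trials: int = 5,
) -> Tuple[bool, List[str]]:
    """Retry a task, carrying written reflections (not weights) between trials."""
    memory: List[str] = []            # plain-text self-diagnoses, kept uncompressed
    for _ in range(max_trials):
        answer = attempt(memory)      # the policy conditions on past reflections
        if evaluate(answer):          # binary environment signal: pass or fail
            return True, memory
        memory.append(reflect(answer))
    return False, memory

# Toy stand-ins for the LLM and the environment (hypothetical).
CANDIDATES = ["sort_ascending", "sort_descending", "reverse"]

def toy_attempt(memory: List[str]) -> str:
    # Pick the first candidate that no earlier reflection has ruled out.
    for candidate in CANDIDATES:
        if not any(candidate in note for note in memory):
            return candidate
    return CANDIDATES[-1]

def toy_evaluate(answer: str) -> bool:
    return answer == "reverse"

def toy_reflect(answer: str) -> str:
    return f"Tried {answer}; the environment reported failure, so avoid {answer}."

solved, notes = reflexion_loop(toy_attempt, toy_evaluate, toy_reflect)
```

Here the loop fails twice, records two reflections, and succeeds on the third trial. Because the verdict is a plain pass or fail, the reflection step has no room to reinterpret a failure as a partial success.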

The surprising shift is that much of this self-evolution happens outside the model's weights entirely. AgentFly reframes learning as memory operations: case, subtask, and tool memories handle credit assignment without touching a single parameter, reaching 87.88% on GAIA validation (Can agents learn continuously from experience without updating weights?). VOYAGER stores skills as executable code in an embedding-indexed library and composes hard skills out of easy ones, which sidesteps the catastrophic forgetting that weight-update methods suffer (Can agents learn new skills without forgetting old ones?). Once skills live in an external library rather than the weights, evolving them becomes a curation problem. SkillOS shows that a separately *trained* curator, decoupled from a frozen executor, pushes a repository away from verbose generic notes toward sharp execution logic and cross-task meta-strategies (Can a separate trained curator improve skill libraries better than frozen agents?). That learning can even pool across an entire user base: SkillClaw uses a central evolver to mine everyone's interaction traces into shared upgrades (How can agent systems share learned skills across users?).
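The skill-library idea can be sketched as follows. This is a toy illustration, not VOYAGER's code: `Skill`, `SkillLibrary`, and the two example skills are hypothetical, and keyword overlap stands in for the embedding search a real system would use. It shows why the approach avoids forgetting: learning a new skill adds an entry, and a harder skill is built by chaining existing entries rather than by overwriting anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]

@dataclass
class Skill:
    name: str
    description: str
    run: Callable[[State], State]   # takes a state, returns a new state

class SkillLibrary:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def add(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def search(self, query: str, k: int = 1) -> List[Skill]:
        # Assumption: word overlap as a stand-in for embedding similarity.
        words = set(query.lower().split())
        ranked = sorted(
            self._skills.values(),
            key=lambda s: len(words & set(s.description.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def compose(self, name: str, description: str, parts: List[str]) -> Skill:
        steps = [self._skills[p] for p in parts]

        def run(state: State) -> State:
            for step in steps:      # run the simpler skills in order
                state = step.run(state)
            return state

        skill = Skill(name, description, run)
        self.add(skill)
        return skill

def chop_wood(s: State) -> State:
    return {**s, "wood": s["wood"] + 1}

def craft_planks(s: State) -> State:
    if s["wood"] < 1:
        return s
    return {**s, "wood": s["wood"] - 1, "planks": s["planks"] + 4}

lib = SkillLibrary()
lib.add(Skill("chop_wood", "chop a tree to collect wood", chop_wood))
lib.add(Skill("craft_planks", "craft planks from wood", craft_planks))
make_planks = lib.compose(
    "make_planks", "collect wood then make planks", ["chop_wood", "craft_planks"]
)
result = make_planks.run({"wood": 0, "planks": 0})
best = lib.search("craft planks from wood")[0].name
```

Curation, in this picture, means deciding which entries to add, rewrite, or retire, which is the job a trained curator would take over from the executing agent.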

Two caveats keep this honest. First, most of these loops aren't truly self-directed: the metacognition (how to plan, what to evaluate) is still a fixed human-designed scaffold that breaks under domain shift or capability changes. Genuine self-improvement would require agents that generate their own learning strategies, which the corpus flags as a real gap (Can AI systems improve their own learning strategies?). Second, who benefits from self-edits isn't uniform: the ability to *write* a useful harness update is flat across model tiers, but the ability to actually *use* one follows an inverted U, peaking at mid-tier models. Weak models don't invoke their own improvements, and strong ones struggle to follow them faithfully (Do stronger models always evolve harnesses better?).

One safety rail is still missing from the loops above: letting agents rewrite their own prompts, tools, and memory is dangerous unless every change is versioned, attributable, and reversible. The Autogenesis Protocol treats these as registered resources with explicit lifecycle and rollback, turning self-improvement from an emergent accident into an auditable process (How can agent self-evolution be made safe and auditable?).
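A minimal sketch of what "versioned, attributable, and reversible" can mean in practice is shown below. It is not the Autogenesis Protocol's actual interface; `Version`, `VersionedResource`, and the example strings are hypothetical. The design choice it illustrates is an append-only history with a movable pointer, so a rollback never destroys the record of what the agent changed or why.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Version:
    number: int
    content: str
    author: str   # who or what proposed the change
    reason: str   # why the change was made

class VersionedResource:
    """A prompt, tool, or memory entry with an append-only change history."""

    def __init__(self, name: str, content: str, author: str = "human") -> None:
        self.name = name
        self._history: List[Version] = [Version(1, content, author, "initial")]
        self._active = 0   # index of the version currently in use

    @property
    def current(self) -> Version:
        return self._history[self._active]

    def propose(self, content: str, author: str, reason: str) -> Version:
        version = Version(len(self._history) + 1, content, author, reason)
        self._history.append(version)
        self._active = len(self._history) - 1
        return version

    def rollback(self, to_number: int) -> Version:
        index = to_number - 1
        if not 0 <= index < len(self._history):
            raise ValueError(f"no version {to_number} for {self.name}")
        self._active = index   # history is kept; only the pointer moves
        return self.current

    def log(self) -> List[Tuple[int, str, str]]:
        return [(v.number, v.author, v.reason) for v in self._history]

prompt = VersionedResource("planner_prompt", "Plan step by step.")
prompt.propose("Plan step by step, then verify.", author="agent", reason="toy self-edit")
after_edit = prompt.current.number
prompt.rollback(1)
after_rollback = prompt.current.content
```

After the rollback the original prompt is active again, while `prompt.log()` still lists both versions with their authors and reasons.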


Sources (11 notes)

Can agents learn beyond what their training data shows?

Agents trained on static expert datasets cannot learn from their own failures or generalize beyond demonstrated scenarios because they never interact with environments during training. Competence is capped by what curators imagined, not by agent capacity.

Can agents learn from their own actions without external rewards?

Research across eight environments shows that agents can use future states from their own actions as supervision without external rewards, matching expert-dependent baselines with half the data and providing superior warm-starts for subsequent RL training.

Can language models learn skills without human supervision?

Ctx2Skill's three-role self-play loop manufactures missing feedback through internal signals: the Challenger escalates difficulty as curriculum, the Judge gives binary verdicts as reward, and both sides evolve via natural-language skill edits. Success requires balancing adversarial pressure against a generalization safeguard to prevent collapse.

Can agents learn from failure without updating their weights?

Reflexion demonstrates that unambiguous environmental feedback (success/failure) enables agents to write useful self-diagnoses and improve across episodes without parameter updates. The binary signal prevents rationalization, and keeping reflections uncompressed preserves their usability.

Can agents learn continuously from experience without updating weights?

AgentFly formalizes agent learning as a Memory-augmented MDP with three memory modules (case, subtask, tool) that enable credit assignment and policy improvement entirely through memory operations. The approach achieved 87.88% on GAIA validation without modifying LLM parameters.

Can agents learn new skills without forgetting old ones?

VOYAGER demonstrates that storing executable skills in an embedding-indexed library and composing complex skills from simpler ones allows agents to learn continuously while avoiding the forgetting that occurs with weight-update-based methods. Environmental feedback refines skills while an automatic curriculum drives continual exploration.

Can a separate trained curator improve skill libraries better than frozen agents?

SkillOS shows that separating a trainable curator from a frozen executor, grouped by task streams, causes skill repositories to shift from generic verbose additions toward actionable execution logic and cross-task meta-strategies. The trained curator generalizes across different executor backbones and domains.

How can agent systems share learned skills across users?

SkillClaw aggregates interaction trajectories across users, processes them through an autonomous evolver that identifies patterns and refines skills, then synchronizes updates system-wide. This converts siloed individual learning into shared capability improvement without manual curation.

Can AI systems improve their own learning strategies?

Current self-improvement methods use extrinsic, fixed metacognitive loops designed by humans that fail under domain shift or capability changes. True self-improvement requires agents to generate their own adaptive metacognitive knowledge, planning, and evaluation—a gap confirmed as a neglected research area across neuro-symbolic AI.

Do stronger models always evolve harnesses better?

Model capability to produce useful harness edits stays constant across tiers, but capacity to actually benefit from those edits follows an inverted U-shape, peaking in mid-tier models. Weak models fail to invoke harnesses; strong models struggle with faithful instruction-following.

How can agent self-evolution be made safe and auditable?

The Autogenesis Protocol treats prompts, tools, and memory as versioned, registered resources with explicit lifecycle and rollback capabilities. This governance layer decouples what evolves from how evolution occurs, making updates measurable, attributable, and reversible—turning self-improvement from an emergent side effect into a disciplined process.
