INQUIRING LINE

Agentic Systems and Tool Use · Model Architecture and Internals · Psychology, Society, and Alignmentcross-cluster

How do agent-created code artifacts become part of harness infrastructure?

This explores the path by which code an agent writes for itself during a task gets promoted into the durable scaffolding (the harness) that future runs and other agents rely on.

This explores the path by which code an agent writes for itself during a task gets promoted into the durable scaffolding (the harness) that future runs and other agents rely on. The cleanest way to see it is to split agent code into three layers: capability baked into the model's weights, the harness of infrastructure that connects the model's outputs to real actions, and the artifacts the agent itself writes mid-run What are the three distinct layers of agent code?. The interesting movement happens at the boundary between the last two — when something the agent scratched together to get through one task stops being throwaff and becomes part of the standing infrastructure everyone reuses.

That promotion step is the least-studied part of the whole stack. Agent-authored code that persists and is shared across agents is named explicitly as the underexplored layer, and the open problems cluster exactly where you'd expect: deciding what lives and what dies, keeping shared state consistent when many agents touch it, and the actual promotion decision from scratch work to durable infrastructure What makes agent-authored code worth persisting and sharing?. The reason code is the natural medium for this is that it is simultaneously executable, inspectable, and stateful — an agent can run it, read it back, and carry state forward, which is what lets a one-off script behave like a reusable tool rather than a dead transcript Can code serve as the operational substrate for agent reasoning?.

The most concrete mechanism for closing the loop is to make creation happen *inside* the agent's reasoning loop rather than as an offline authoring step. When skill creation is itself a tool the agent invokes mid-task, the new skill is grounded in the exact context that prompted it, validated against immediate runtime feedback, and — crucially — transfers to other agents with minimal loss Does creating skills inside the agent loop eliminate mismatches?. A related route is mining experience after the fact: inducing reusable sub-task routines from past trajectories and compounding them hierarchically, which yields large gains precisely because the agent stops re-deriving the same procedures every run Can agents learn reusable sub-task routines from past experience?. Both are versions of the same move — turn what worked once into standing infrastructure.

There's a twist worth knowing: the ability to *write* useful harness updates is roughly flat across model sizes, but the ability to *benefit* from them follows an inverted-U, peaking at mid-tier models. Weak models never invoke the new infrastructure; very strong ones stumble on faithfully following their own scaffolding Do stronger models always evolve harnesses better?. So promotion isn't free — an artifact only becomes real infrastructure if the agents downstream actually reach for it. That's also why what gets promoted should be governed rather than ambient: when rules live in the memory layer the agent consults during decisions, they actually get used, whereas external policy appendices get ignored Can governance rules embedded in runtime memory actually protect autonomous agents?.

The broader reframing underneath all of this is that agent reliability has migrated out of the model weights and into the surrounding harness of memory, skills, and protocols Where does agent capability really come from?. Once you accept that, the question stops being "how do we build a better harness up front" and becomes "how do we let agents grow their own harness as they work" — and the open frontier is the promotion decision itself: which scratch artifacts earn permanence, how shared state stays coherent, and how a tool authored by one agent becomes infrastructure for the rest.

Sources 8 notes

What are the three distinct layers of agent code?

Long-running agentic systems decompose into model-internal capabilities (trained reasoning), system-provided harness (infrastructure connecting outputs to actions), and agent-initiated artifacts (code created during execution). Each layer fails and improves differently, and this separation clarifies where to intervene.

What makes agent-authored code worth persisting and sharing?

Of three agentic code elements, agent-initiated artifacts that persist and are shared across agents remain underexplored. Open challenges cluster around lifecycle decisions, shared state consistency, and promotion from scratch work to durable infrastructure.

Can code serve as the operational substrate for agent reasoning?

Research shows code uniquely enables agent reasoning, action, and verification by being simultaneously executable, inspectable, and stateful. This unified code-centered loop improves reasoning and verification together compared to natural-language or prose-based approaches.

Does creating skills inside the agent loop eliminate mismatches?

MUSE-Autoskill demonstrates that invoking skill creation from within the agent's reasoning loop grounds new skills in exact task context, immediate feedback, and runtime validation. In-loop skills reach 87.94% task accuracy and transfer to other agents with minimal loss, eliminating the situated context problem of offline authoring.

Can agents learn reusable sub-task routines from past experience?

Agent Workflow Memory induces sub-task routines at finer granularity than full tasks, abstracts example-specific values, and compounds them hierarchically. This produces 24.6% relative gain on Mind2Web and 51.1% on WebArena, with larger gains as train-test gaps widen.

Do stronger models always evolve harnesses better?

Model capability to produce useful harness edits stays constant across tiers, but capacity to actually benefit from those edits follows an inverted U-shape, peaking in mid-tier models. Weak models fail to invoke harnesses; strong models struggle with faithful instruction-following.

Can governance rules embedded in runtime memory actually protect autonomous agents?

A persistent agent recorded 889 governance events across 96 active days, with safeguards encoded directly into the memory layer the agent consulted during operation. Runtime-resident governance proved more effective than external policies because the agent actually accessed it during decision-making.

Where does agent capability really come from?

Research shows that agent capability shifts from the model itself to the surrounding harness of memory, skills, and protocols. Reliability emerges from externalizing cognitive burden into structured scaffolding rather than scaling model weights.

How do agent-created code artifacts become part of harness infrastructure?

Sources 8 notes

Next inquiring lines