SYNTHESIS NOTE

Can skills work better as weights than as prompts?

Most agent systems store skills as text in prompts, but this inflates token costs and degrades model performance. Could compiling skills into trainable weight-space adapters instead offer a better trade-off between efficiency and capability?

Synthesis note · 2026-06-27 · sourced from Agent Harness

Most agent-skill systems retrieve a relevant textual procedure and paste it into the prompt at each decision step. It is modular and simple, but it scales badly: the same skill text gets re-inserted across steps, inflating prefill cost, and long inputs degrade the model's ability to use all of it. LatentSkill's bet is that the substrate for skills is wrong. It uses a pretrained hypernetwork to compile a textual skill into a plug-and-play LoRA adapter, storing the procedure in weight space rather than context space — removing per-step skill tokens (64% fewer prefill tokens on ALFWorld, 72% lower skill-token overhead on Search-QA) while still beating the in-context baseline on success.

What makes this more than a compression trick is that the weight-space skills retain the properties that made textual skills useful. The generated LoRAs form a structured semantic geometry, can be dialed up or down via the scaling coefficient, and — when components are aligned — can be composed through parameter-space arithmetic. Skills become objects you add and weight, not strings you concatenate. A secondary benefit is that the skill is no longer exposed as plaintext in the prompt, which has obvious implications for skill-IP and prompt-injection surfaces.

This sits beside Can models dynamically activate expert skills at inference time?: both reject context-space capability injection in favor of composable weight-space experts, though SVF composes existing expert vectors at test time while LatentSkill generates an adapter from a text skill on demand. It also reframes the harness/skill lifecycle — where Can skill documents be optimized like neural network weights? keeps the skill as trainable text, LatentSkill argues the deployed form should be weights. The open risk is fidelity: hypernetwork-generated adapters may capture less nuance than the source text, and parameter-space composition only works when components happen to be aligned. Inspectability also drops — a LoRA is far harder to audit than a skill.md, which cuts directly against the COLLEAGUE.SKILL governance argument.

Inquiring lines that use this note as a source 3

This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.

Related concepts in this collection 3

This note in its neighbourhood — explore the map, then jump to a related concept in the list below.

Concept map
13 direct connections · 104 in 2-hop network ·medium cluster Open in graph ↗

Click a node to walk · click center to open · click Open in graph to see this note in the full knowledge graph

your link semantically near linked from elsewhere

Related papers in this collection 8

Papers most semantically related to this note, ranked by cosine similarity in the embedding space.

Original note title

moving agent skills from context space to weight space trades plaintext prompt overhead for composable LoRA adapters — skills become parameters you can scale and add