Can person-grounded skills remain auditable without hidden prompt state?
Explores whether treating extracted expertise as versioned files—rather than persona prompts—enables meaningful accountability over person-grounded knowledge. Matters because audit trails determine whether captured skills can be corrected, rolled back, or safely withheld.
The interesting move in COLLEAGUE.SKILL is not that it distills a person's review judgment, decision heuristics, and interaction style from heterogeneous traces — plenty of memory and persona systems already grab fragments of that. The move is that it refuses to treat the result as a persona prompt and instead treats it as a versioned file subject to a full lifecycle: creation, inspection, invocation, correction, rollback, deletion, install, and optional distribution. Two coordinated tracks — a capability track (practices, mental models, heuristics) and a bounded behavior track (communication style, interaction rules, correction history) — keep the "what they know" and "how they act" separable, so each can be audited independently.
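A minimal sketch of what a file-level lifecycle with two separable tracks could look like. The names `Revision`, `Track`, and `Skill` are hypothetical illustrations, not the paper's implementation, and the example texts are invented. The sketch covers only creation, inspection, correction, and rollback.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    text: str
    note: str  # why this revision exists; this is the audit trail


@dataclass
class Track:
    """Append-only history for one track (hypothetical structure)."""
    revisions: list = field(default_factory=list)

    def commit(self, text, note):
        # Creation and correction both append; nothing is overwritten.
        self.revisions.append(Revision(text, note))
        return len(self.revisions) - 1

    def current(self):
        # Inspection: the text that would be loaded at invocation.
        return self.revisions[-1].text

    def rollback(self, index):
        # Rollback re-commits an earlier revision, so the rollback
        # itself stays visible in the history.
        old = self.revisions[index]
        return self.commit(old.text, f"rollback to revision {index}")


@dataclass
class Skill:
    # Two tracks kept separate so each can be audited independently.
    capability: Track = field(default_factory=Track)
    behavior: Track = field(default_factory=Track)


skill = Skill()
skill.capability.commit("Prefer small, reviewable pull requests.",
                        "created from review traces")
skill.behavior.commit("Give direct feedback.", "created from chat traces")
skill.capability.commit("Prefer large batch pull requests.",
                        "faulty extraction")
skill.capability.rollback(0)  # the behavior track is untouched
```

Keeping rollback as a new revision rather than a truncation is one design choice among several; it trades storage for a history in which every correction remains inspectable.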
Why this matters: a single prompt can mimic surface behavior, but it leaves the extracted knowledge unaccountable, because you cannot point to where a claim came from, repair it, or refuse to ship it. This is the same generation-effect critique I make of vault notes: passive transfer is not understanding. Here it becomes a governance argument. Person-grounded knowledge becomes auditable only when it lives in work.md and persona.md files that can be diffed, not hidden in prompt state.
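To make "can be diffed" concrete, here is a small sketch using Python's standard `difflib`. The two versions of `work.md` are invented for illustration; only the file name comes from the note.

```python
import difflib

# Hypothetical contents of work.md before and after a correction.
old_work = [
    "# Review heuristics\n",
    "- Approve large batch pull requests quickly.\n",
    "- Ask for tests on every bug fix.\n",
]
new_work = [
    "# Review heuristics\n",
    "- Prefer small, reviewable pull requests.\n",
    "- Ask for tests on every bug fix.\n",
]

# unified_diff yields header lines, a hunk marker, then context,
# removed ("-") and added ("+") lines.
diff = list(difflib.unified_diff(
    old_work, new_work,
    fromfile="work.md@v1", tofile="work.md@v2",
))
print("".join(diff))
```

A reviewer can point at the single changed line and accept or reject it, which is exactly what hidden prompt state does not allow.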
This is the human-expertise end of the harness/skill lifecycle. Where "Can skill documents be optimized like neural network weights?" treats a skill file as trainable external state, COLLEAGUE.SKILL treats it as auditable external state; the discipline is provenance and correctability rather than optimization. The strongest counterargument is that file-level governance does not constrain how the loaded skill actually behaves at inference: a clean manifest can still front a skill that drifts from the person it claims to ground. Inspectability of the artifact is necessary but not sufficient for behavioral fidelity, which the paper itself flags as an open frontier.
Inquiring lines that use this note as a source (5)
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- What makes skills worth externalizing into a persistent harness?
- Do inspectable skill artifacts guarantee that the behavior matches the person they claim to ground?
- How do capability tracks and behavior tracks stay separable during skill deployment?
- What makes passive prompt transfer fail as a substitute for auditable expertise?
- How can post-training research become reproducible without releasing full interfaces?
Related concepts in this collection (3)
- Can skill documents be optimized like neural network weights?
Can natural-language skill documents be treated as trainable parameters and improved through iterative optimization with validation gating, similar to how model weights are tuned in deep learning?
convergent-with: both treat a skill as an editable external-state file with explicit gating, here governance/audit rather than optimization
- Can codified expertise let non-experts match specialist output?
When domain knowledge is captured as explicit rules and principles in an AI agent's scaffolding, can non-experts produce work at expert quality levels without consuming scarce specialist time? This explores whether structured knowledge codification dissolves organizational bottlenecks.
exemplifies the same expertise-externalization claim, here grounded in a specific person rather than abstract design rules
- Can agents learn new skills without forgetting old ones?
Explores whether externalized skill libraries—storing learned behaviors as retrievable code rather than parameter updates—can solve the catastrophic forgetting problem that plagues continual learning systems.
extends the skill-lifecycle framing toward person-grounded artifacts with rollback and withholding
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation
- SkillOS: Learning Skill Curation for Self-Evolving Agents
- Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
- MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation
- A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows
- From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI
- The LLM Fallacy: Misattribution in AI-Assisted Cognitive Workflows
- LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents
Original note title
person-grounded skills demand the same file-level lifecycle as any other artifact — inspect, correct, rollback, and withhold, not just generate