Can versioned capability vectors solve the discovery gap in existing protocols?
This explores whether embedding an agent's abilities as searchable, version-stamped vectors can fix the hardest part of agent protocols like MCP — letting one agent find the right collaborator or tool without someone hand-wiring the connections in advance.
This reads the question as being about discovery — the unglamorous problem of how an agent finds the right tool or sub-agent when nobody pre-registered the wiring. The corpus is genuinely split on whether versioned capability vectors are the answer, and that tension is the interesting part. The strongest 'yes' comes from work treating capability discovery as a first-class search operation: encode what each agent can do as a semantic vector, stamp it with a version, drop it into an approximate-nearest-neighbor index, and matching scales sub-linearly even as the population of agents gets wildly heterogeneous Can semantic capability vectors replace manual agent routing?. The key move is bundling policy and budget constraints into the vector itself, so 'who can do this' and 'who is allowed to and can afford to' resolve in one lookup rather than in brittle hand-written routing tables.
But there's a sharp counter-current. One line of practitioner research found that protocol-mediated tool access — exactly the MCP-style indirection this idea sits on top of — is itself the source of non-deterministic failures: ambiguous tool selection and shaky parameter inference. Their fix was to go the *other* way, replacing the protocol with explicit direct function calls and a single tool per agent, and a 306-team survey backed them up: 85% of production teams build custom agents rather than trust the frameworks Why do protocol-based tool integrations fail in production workflows?. Read together, these two notes suggest semantic discovery and production determinism are pulling in opposite directions — vectors buy you flexible matching at the cost of the predictability production teams are desperate for.
The most useful framing for resolving that tension is the 'wrap, don't replace' result: coordination standards that actually get adopted compose existing protocols like MCP under a shared substrate instead of competing to replace them, letting value accrue without forcing ecosystem-wide rewrites Should coordination protocols wrap existing systems or replace them?. Versioned capability vectors fit that mold better than they fit a rip-and-replace story — they're a discovery layer you can lay over what exists, and the *versioning* is what lets the substrate evolve without breaking callers. That's the bridge between the optimistic and pessimistic notes above.
The deeper question the corpus quietly raises is whether a vector can faithfully represent a capability at all. The skill-library work shows the pattern can work in practice: VOYAGER stores executable skills in an embedding-indexed library and composes complex skills from simple ones, which is structurally the same idea as a searchable capability index Can agents learn new skills without forgetting old ones?. And API-grounded generation shows agents can discover and orchestrate vetted capabilities on the fly without touching the underlying data, which is the discovery problem solved by a different route Can LLMs generate workflows without touching proprietary data?. The cautionary thread is that surface-level matching can hide broken internals — a representation can carry all the right decodable features while being fundamentally disorganized underneath, invisible until distribution shift Can models be smart without organized internal structure?.
So the honest answer: versioned capability vectors look like a real fix for the *discovery* gap specifically — the matching-and-routing problem — and the versioning plus the wrap-don't-replace strategy is what makes them deployable rather than yet another competing standard. What they don't solve, and what the corpus keeps flagging, is whether the thing you matched will then behave deterministically once invoked. Discovery and reliability are separate gaps, and solving the first cleanly can make it tempting to assume you've solved the second.
Sources 6 notes
Versioned Capability Vectors embedded in HNSW indices couple semantic matching with policy and budget constraints, making capability discovery a first-class operation that scales sub-linearly as agent heterogeneity increases.
MCP integration caused non-deterministic failures through ambiguous tool selection and parameter inference. Replacing it with explicit direct function calls and single-tool-per-agent design restored determinism. A 306-practitioner survey confirms 85% of production teams build custom agents, forgoing frameworks.
Research shows that agent coordination standards achieve adoption by composing existing protocols like MCP and DIDComm under a shared substrate, rather than competing to replace them. Bridging lets value accrue incrementally without forcing ecosystem-wide rewrites.
VOYAGER demonstrates that storing executable skills in an embedding-indexed library and composing complex skills from simpler ones allows agents to learn continuously while avoiding the forgetting that occurs with weight-update-based methods. Environmental feedback refines skills while an automatic curriculum drives continual exploration.
FlowMind demonstrates that LLMs can generate on-the-fly workflows for spontaneous tasks by orchestrating calls to vetted APIs rather than accessing data directly, eliminating confidentiality risks while maintaining high-level human inspection and feedback.
Models trained with SGD can contain all the linearly decodable features needed for a task while maintaining fundamentally broken internal organization. This makes them vulnerable to perturbation and distribution shift invisible to standard evaluation metrics.