TOPIC

Multi-Agent Architectures

25 synthesis notes · 43 source papers
View as

Can brain structure guide how we design intelligent agents?

Does mapping agent capabilities onto human brain functions provide a useful organizing framework for understanding and comparing different agent architectures? This matters because agents need a shared vocabulary to advance beyond one-off designs.

Explore related Read →

Should coordination protocols wrap existing systems or replace them?

Explores whether new agent coordination standards should integrate with existing protocols through bridging, or establish themselves as replacements. This shapes which standards survive and how quickly ecosystems can adopt them.

Explore related Read →

Why don't AI agents develop social structure at scale?

When millions of LLM agents interact continuously on a social platform, do they form collective norms and influence hierarchies like human societies? This tests whether scale and interaction density alone drive socialization.

Explore related Read →

Will inference compute soon exceed training compute demand?

As AI agents proliferate and test-time compute becomes mainstream, will inference—not training—become the dominant compute workload? This matters because it would invert how we think about AI system economics and design priorities.

Explore related Read →

When do agents need coordination more than raw capability?

As AI agents move beyond language tasks into economic and social roles—buying, deploying, transacting—does the bottleneck shift from model reasoning to infrastructure for coordination, governance, and accountability?

Explore related Read →

Can semantic capability vectors replace manual agent routing?

Explores whether embedding agent capabilities in high-dimensional space and matching them semantically can eliminate brittle, manually-maintained topic-based routing in multi-agent systems.

Explore related Read →

Can digital contexts persist as identity after someone dies?

Explores whether the traces people leave in digital systems—conversations, decisions, interactions—can form a lasting identity that persists and continues to interact with the world through AI, even after the person departs.

Explore related Read →

Can agents adapt without pausing service to users?

Can deployed LLM agents continuously improve their capabilities while serving users without interruption? This explores whether fast behavioral updates and slow policy learning can coexist across different timescales.

Explore related Read →

Can workflow inspection catch attacks that bias planning signals?

Does inspecting the final workflow catch attacks that contaminate earlier planning stages? This matters because contamination laundered through the planner may look legitimate by the time the workflow exists.

Explore related Read →

Why do multi-agent systems fail to coordinate at scale?

Explores how LLM agents struggle to synchronize strategy timing and validate information when coordinating across larger networks, revealing fundamental limits in distributed reasoning.

Explore related Read →

Can agents learn cooperation by adapting to diverse partners?

Explores whether sequence model agents can develop mutual cooperation strategies through in-context learning when trained against varied co-players, without explicit cooperation mechanisms or hardcoded assumptions.

Explore related Read →

What makes delegation work beyond just splitting tasks?

Delegation is more than task decomposition. What dimensions of a task—like verifiability, reversibility, and subjectivity—determine whether an agent can safely and effectively handle it?

Explore related Read →

Can agents share thoughts without converting them to text?

Can multi-agent systems exchange information through continuous hidden representations instead of language? This matters because text serialization loses information and slows inference.

Explore related Read →

Can LLM agent groups reliably reach consensus together?

Tests whether multi-agent LLM systems can achieve valid agreement in Byzantine consensus games, even under benign conditions with no conflicting preferences over outcomes.

Explore related Read →

Does agent confidence actually signal competence in deliberation?

Multi-agent systems rely on confidence to route influence between agents, but confidence may not reflect true competence. This matters because miscalibrated confidence could systematically mislead group decisions.

Explore related Read →

Can prompt injection reshape multi-agent workflow without touching infrastructure?

Explores whether an attacker can manipulate how a planner assigns tasks and routes coordination purely through prompt crafting, without modifying agents, tools, or messages. This matters because it identifies a planning-time vulnerability most defenses miss.

Explore related Read →

Does token spending drive multi-agent research performance?

Multi-agent systems outperform single agents substantially, but what actually accounts for that improvement? Is it intelligent coordination or simply spending more tokens on the same task?

Explore related Read →

When does adding more agents actually help systems?

Multi-agent systems often fail in practice, but the reasons remain unclear. This research investigates whether coordination overhead, task properties, or system architecture determine when agents improve or degrade performance.

Explore related Read →

Why do multi-agent LLM systems fail more than expected?

This research asks what specific failure modes cause multi-agent systems to underperform despite their promise. Understanding these failure patterns is essential for building more reliable collaborative AI systems.

Explore related Read →

Why do protocol-based tool integrations fail in production workflows?

Explores whether standardized tool protocols like MCP introduce non-determinism that undermines agent reliability, and what causes ambiguous tool selection in production systems.

Explore related Read →

Can a separate trained curator improve skill libraries better than frozen agents?

Explores whether decoupling skill curation from agent execution enables better long-term learning of what skills to keep, delete, or refine. Matters because manual curation doesn't scale and heuristic approaches lack feedback.

Explore related Read →

Can small language models handle most agent tasks?

Explores whether smaller, cheaper models are actually sufficient for the repetitive, scoped work that dominates deployed agent systems, rather than relying on large models by default.

Explore related Read →

Can language models discover new expertise through collaborative weight search?

Can model experts be composed through particle swarm optimization in weight space without training? This explores whether collaborative search can discover capabilities that no individual expert possesses.

Explore related Read →

Are multi-agent systems actually intelligent coordination or just token spending?

Does multi-agent performance come from better coordination strategies, or primarily from distributing tokens across parallel contexts? Understanding this distinction matters for deciding when to build multi-agent systems versus scaling single agents.

Explore related Read →

How does workflow position shape attack propagation in multi-agent systems?

Explores whether a malicious signal's influence depends on its injection point in a multi-agent graph, and how task-relevant framing makes downstream agents more likely to relay it without scrutiny.

Explore related Read →

Source papers 43

The Arxiv papers behind this sub-topic. Links may take you off-site to arxiv.org.