What decisions must multi-agent routing systems optimize simultaneously?
Standard LLM routing only picks which model to use. But multi-agent systems involve four interdependent choices: topology, agent count, role assignment, and per-agent model selection. Does optimizing all four together actually improve performance?
Standard LLM routing (RouteLLM, Hybrid-LLM) optimizes a single decision: which model handles this query. MasRouter argues this is an incomplete optimization for multi-agent systems, where routing involves four simultaneous decisions:
- Collaboration mode determination — choosing the optimal communication topology (Chain, Tree, Graph) for varying task complexities
- Dynamic agent number — determining how many expert agents are required based on input difficulty
- Agent role allocation — selecting suitable roles per agent according to the query domain
- Agent LLM routing — assigning each agent the appropriate LLM backbone
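The four decisions above can be pictured as one joint record per query. The sketch below is illustrative only: `RoutingDecision`, `AgentSpec`, the role and model names, and the `joint_space_size` count are assumptions made for this note, not MasRouter's data structures.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Topology(Enum):
    CHAIN = "chain"
    TREE = "tree"
    GRAPH = "graph"


@dataclass(frozen=True)
class AgentSpec:
    role: str  # hypothetical role label, e.g. "programmer"
    llm: str   # hypothetical backbone name


@dataclass(frozen=True)
class RoutingDecision:
    topology: Topology              # decision 1: collaboration mode
    agents: tuple[AgentSpec, ...]   # decisions 3 and 4: role and LLM per agent

    @property
    def agent_count(self) -> int:   # decision 2: how many agents
        return len(self.agents)


def single_model_route(llm: str) -> RoutingDecision:
    """Single-model routing as the degenerate case: one agent, one LLM."""
    return RoutingDecision(Topology.CHAIN, (AgentSpec("generalist", llm),))


def joint_space_size(n_topologies: int, n_roles: int, n_llms: int, max_agents: int) -> int:
    """Crude count of joint configurations (ordered agents, 1..max_agents of them)."""
    per_agent = n_roles * n_llms
    return n_topologies * sum(per_agent ** k for k in range(1, max_agents + 1))
```

Under these toy assumptions, even 3 topologies, 4 roles, 2 LLMs and at most 2 agents give 216 joint configurations, whereas single-model routing chooses among only 2 options.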
The formal definition of Multi-Agent System Routing (MASR) integrates all four into a unified framework. MasRouter implements this through a cascaded controller network: a variational latent variable model routes the query to a collaboration module, a structured probabilistic cascade generates agent roles progressively, and a multinomial distribution model recommends LLM backbones per agent. The cascade is sequential by design — topology constrains which roles make sense, and roles constrain which LLMs are appropriate.
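The sequential dependency described above (topology, then roles, then LLMs) can be sketched with toy lookup tables standing in for the learned controllers. Everything here is an assumption for illustration: the difficulty labels, role names, model names, weights, and the uniform choice of agent count are invented, and the real system uses trained networks rather than fixed tables.

```python
import random

# Toy conditional tables; all names and weights are made up for illustration.
TOPOLOGY_WEIGHTS = {
    "easy": {"chain": 0.7, "tree": 0.2, "graph": 0.1},
    "hard": {"chain": 0.1, "tree": 0.3, "graph": 0.6},
}
ROLES_BY_TOPOLOGY = {
    "chain": ["planner", "solver"],
    "tree": ["planner", "solver", "critic"],
    "graph": ["planner", "solver", "critic", "tester"],
}
LLM_WEIGHTS_BY_ROLE = {
    "planner": {"large-llm": 0.8, "small-llm": 0.2},
    "solver": {"large-llm": 0.5, "small-llm": 0.5},
    "critic": {"large-llm": 0.3, "small-llm": 0.7},
    "tester": {"large-llm": 0.1, "small-llm": 0.9},
}


def sample(weights, rng):
    """Draw one key from a {key: weight} table."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]


def route(difficulty, rng):
    # Stage 1: pick a topology conditioned on the query (here: a difficulty label).
    topology = sample(TOPOLOGY_WEIGHTS[difficulty], rng)
    # Stage 2: the topology constrains which roles are available.
    allowed = ROLES_BY_TOPOLOGY[topology]
    n_agents = rng.randint(1, len(allowed))
    roles = []
    for _ in range(n_agents):
        # Roles are generated progressively, conditioned on the roles chosen so far.
        remaining = [r for r in allowed if r not in roles]
        roles.append(rng.choice(remaining))
    # Stage 3: each role constrains the distribution over LLM backbones.
    agents = [(role, sample(LLM_WEIGHTS_BY_ROLE[role], rng)) for role in roles]
    return {"topology": topology, "agents": agents}
```

Calling `route("hard", random.Random(0))` returns a dictionary with a topology and a list of `(role, llm)` pairs. Each intermediate choice is inspectable, which is the interpretability property the cascade is meant to provide.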
The results validate the multi-dimensional approach: MasRouter surpasses RouterDC (SOTA single-model routing) by 3.51% average accuracy while reducing HumanEval cost from $0.363 to $0.185 (49% reduction). The framework generalizes to unseen LLM backbones and collaboration modes, and integrates with mainstream MAS for 17-28% cost reduction.
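As a quick arithmetic check on the HumanEval cost figures quoted above, the relative reduction can be recomputed directly. The helper name `relative_reduction` is just for this note.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before


# Cost figures as quoted in the note (USD, HumanEval).
reduction = relative_reduction(0.363, 0.185)
print(f"{reduction:.1%}")  # prints 49.0%
```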
Compared with FlowReasoner (see "Can AI systems design unique multi-agent workflows per individual query?"), MasRouter provides a more structured alternative: FlowReasoner generates system designs via RL-trained code generation (maximum flexibility, less interpretability), while MasRouter's topology→role→LLM cascade provides interpretable intermediate decisions at the cost of fixed structure types. Compared with DyLAN (see "Can multi-agent teams automatically remove their weakest members?"), the approaches are complementary: DyLAN prunes within a running network while MasRouter constructs the network from scratch, so the two could be composed (MasRouter for initial construction, DyLAN for runtime adaptation).
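The composition suggested above (construction-time routing followed by runtime pruning) might look like the following loop. This is a hypothetical sketch, not DyLAN's or MasRouter's algorithm: `prune_weakest`, `run_with_pruning`, and the `score_round` callback are invented names, and the scoring is assumed to come from some external peer-rating step.

```python
def prune_weakest(agents, scores, min_agents=1):
    """Drop the lowest-scoring agent, keeping at least `min_agents`."""
    if len(agents) <= min_agents:
        return list(agents)
    weakest = min(agents, key=lambda a: scores[a])
    return [a for a in agents if a != weakest]


def run_with_pruning(initial_agents, score_round, rounds, min_agents=1):
    """Start from a router-constructed team, then prune once per round.

    `initial_agents` stands in for the team a construction-time router built;
    `score_round(agents)` is an assumed callback returning {agent: score}.
    """
    agents = list(initial_agents)
    for _ in range(rounds):
        scores = score_round(agents)
        agents = prune_weakest(agents, scores, min_agents)
    return agents


if __name__ == "__main__":
    fixed_scores = {"planner": 0.9, "solver": 0.7, "critic": 0.2}
    team = run_with_pruning(
        ["planner", "solver", "critic"],
        score_round=lambda agents: {a: fixed_scores[a] for a in agents},
        rounds=1,
    )
    print(team)
```

The `min_agents` floor is a design choice for the sketch: it keeps runtime pruning from undoing the router's decision entirely.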
The formalization matters because it surfaces what single-model routing leaves on the table. As discussed in "When does adding more agents actually help systems?", error amplification depends on topology; routing to the right topology per query is MasRouter's direct response: rather than accepting a fixed topology's scaling limitations, it routes around them.
Inquiring lines that use this note as a source (11)
- Should model routing decisions account for prompt-tier dependencies?
- Can model routing and compute allocation work together as independent optimizations?
- What specific network sizes trigger coordination degradation in LLM systems?
- How should topology routing adapt to different task types?
- What structural constraints does topology impose on role and LLM assignment?
- Can construction-time routing and runtime agent pruning be combined effectively?
- How do multi-agent routers balance flexibility against interpretability in design?
- How do routers decide when to escalate from small to large models?
- Can multiple small models outperform a single large model with good routing?
- How does routing decide between models before generation happens?
- What four decisions matter most in multi-agent system routing?
Related concepts in this collection (5)
- Can AI systems design unique multi-agent workflows per individual query?
  Explores whether meta-agents trained with reinforcement learning can automatically generate personalized multi-agent system architectures tailored to individual user queries, rather than applying fixed task-level templates uniformly.
  FlowReasoner: more flexible per-query design via RL code generation; MasRouter: more structured cascade.
- Can multi-agent teams automatically remove their weakest members?
  Explores whether agents can score each other's contributions during problem-solving and use those scores to deactivate underperforming teammates in real time, improving overall team efficiency.
  Runtime pruning complements construction-time routing.
- When does adding more agents actually help systems?
  Multi-agent systems often fail in practice, but the reasons remain unclear. This research investigates whether coordination overhead, task properties, or system architecture determine when agents improve or degrade performance.
  MasRouter routes around topology limitations per query.
- Can routers select the right model before generation happens?
  Explores whether LLMs can be matched to queries by estimating difficulty upfront, before any generation begins. This matters because routing could cut costs significantly while preserving response quality.
  Single-model routing as the simplest case of MASR.
- Can semantic capability vectors replace manual agent routing?
  Explores whether embedding agent capabilities in high-dimensional space and matching them semantically can eliminate brittle, manually-maintained topic-based routing in multi-agent systems.
  FoA elevates the routing primitives themselves to first-class artifacts: where MasRouter routes per query given a fixed agent pool, FoA enables semantic discovery over a dynamic capability-published ecosystem. The two sit at different layers of the stack: MasRouter solves "which agent for this query"; FoA solves "what does each agent advertise it can do".
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- MasRouter: Learning to Route LLMs for Multi-Agent Systems
- Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets
- Multi-Agent Systems are Mixtures of Experts: Who Becomes an Influencer?
- Drop the Hierarchy and Roles: How Self-Organizing LLM Agents Outperform Designed Structures
- Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization
- Scaling Behavior of Single LLM-Driven Multi-Agent Systems
- Towards a Science of Scaling Agent Systems
- AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
Original note title
multi-agent system routing requires four simultaneous decisions — collaboration topology agent count role allocation and per-agent LLM selection