SYNTHESIS NOTE

What decisions must multi-agent routing systems optimize simultaneously?

Standard LLM routing only picks which model to use. But multi-agent systems involve four interdependent choices: topology, agent count, role assignment, and per-agent model selection. Does optimizing all four together actually improve performance?

Synthesis note · 2026-02-23 · sourced from Routers

Standard LLM routing (RouteLLM, Hybrid-LLM) optimizes a single decision: which model handles this query. MasRouter argues this is an incomplete optimization for multi-agent systems, where routing involves four simultaneous decisions:

Collaboration mode determination — choosing the optimal communication topology (Chain, Tree, Graph) for varying task complexities
Dynamic agent number — determining how many expert agents are required based on input difficulty
Agent role allocation — selecting suitable roles per agent according to the query domain
Agent LLM routing — assigning each agent the appropriate LLM backbone
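
As a rough illustration of how the four decisions fit into one routing output, the sketch below packs them into a single record. The class names, role labels, and backbone identifiers are hypothetical and are not taken from MasRouter.

```python
from dataclasses import dataclass
from enum import Enum


class Topology(Enum):
    # The three collaboration modes named in the list above.
    CHAIN = "chain"
    TREE = "tree"
    GRAPH = "graph"


@dataclass(frozen=True)
class AgentSpec:
    role: str  # hypothetical role label, e.g. "programmer"
    llm: str   # hypothetical backbone identifier


@dataclass(frozen=True)
class RoutingDecision:
    topology: Topology
    agents: tuple[AgentSpec, ...]

    @property
    def agent_count(self) -> int:
        # The agent count is implied by the number of (role, LLM) pairs.
        return len(self.agents)


decision = RoutingDecision(
    topology=Topology.CHAIN,
    agents=(
        AgentSpec(role="planner", llm="large-model"),
        AgentSpec(role="programmer", llm="small-model"),
    ),
)
print(decision.agent_count)  # 2
```

Single-model routing corresponds to the degenerate case of one agent with one backbone and no topology choice.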

The formal definition of Multi-Agent System Routing (MASR) integrates all four into a unified framework. MasRouter implements this through a cascaded controller network: a variational latent variable model routes the query to a collaboration module, a structured probabilistic cascade generates agent roles progressively, and a multinomial distribution model recommends LLM backbones per agent. The cascade is sequential by design — topology constrains which roles make sense, and roles constrain which LLMs are appropriate.
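
The sequential dependency described above can be sketched with hand-written probability tables standing in for the learned controllers. This is a toy under stated assumptions: the difficulty labels, the tables, and the `route` function are invented for illustration, and MasRouter's actual controllers are learned models rather than lookup tables.

```python
import random

# All numbers below are made up for illustration, not values from MasRouter.
TOPOLOGY_PROBS = {
    "easy": {"chain": 0.7, "tree": 0.2, "graph": 0.1},
    "hard": {"chain": 0.1, "tree": 0.3, "graph": 0.6},
}
AGENT_COUNT = {"easy": 2, "hard": 4}
# Stage 2 is conditioned on stage 1: each topology admits its own roles.
ROLES_BY_TOPOLOGY = {
    "chain": ["planner", "programmer", "tester"],
    "tree": ["planner", "programmer", "reviewer"],
    "graph": ["planner", "programmer", "reviewer", "tester"],
}
# Stage 3 is conditioned on stage 2: one categorical over backbones per role.
LLM_PROBS_BY_ROLE = {
    "planner": {"large-model": 0.8, "small-model": 0.2},
    "programmer": {"large-model": 0.6, "small-model": 0.4},
    "reviewer": {"large-model": 0.3, "small-model": 0.7},
    "tester": {"large-model": 0.1, "small-model": 0.9},
}


def sample(probs: dict[str, float], rng: random.Random) -> str:
    """Draw one key from a categorical distribution."""
    keys = list(probs)
    return rng.choices(keys, weights=[probs[k] for k in keys], k=1)[0]


def route(difficulty: str, rng: random.Random) -> dict:
    # Stage 1: pick the collaboration topology.
    topology = sample(TOPOLOGY_PROBS[difficulty], rng)
    candidates = ROLES_BY_TOPOLOGY[topology]
    count = min(AGENT_COUNT[difficulty], len(candidates))
    # Stage 2: pick roles one at a time, each conditioned on those already chosen.
    roles: list[str] = []
    for _ in range(count):
        remaining = [r for r in candidates if r not in roles]
        roles.append(rng.choice(remaining))
    # Stage 3: pick a backbone for each role.
    agents = [(role, sample(LLM_PROBS_BY_ROLE[role], rng)) for role in roles]
    return {"topology": topology, "agents": agents}


plan = route("hard", random.Random(0))
print(plan)
```

Each stage only sees the output of the one before it, which is what makes the intermediate decisions inspectable.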

The results validate the multi-dimensional approach: MasRouter surpasses RouterDC (SOTA single-model routing) by 3.51% average accuracy while reducing HumanEval cost from $0.363 to $0.185 (49% reduction). The framework generalizes to unseen LLM backbones and collaboration modes, and integrates with mainstream MAS for 17-28% cost reduction.
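
The quoted cost reduction can be checked directly from the two HumanEval figures. The helper name `relative_reduction` is ours.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


reduction = relative_reduction(0.363, 0.185)
print(f"{reduction:.1%}")  # 49.0%
```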

Compared with the approach in "Can AI systems design unique multi-agent workflows per individual query?", MasRouter provides a more structured alternative: FlowReasoner generates system designs via RL-trained code generation (maximum flexibility, less interpretability), while MasRouter's topology→role→LLM cascade provides interpretable intermediate decisions at the cost of fixed structure types. Compared with the approach in "Can multi-agent teams automatically remove their weakest members?", the two are complementary: DyLAN prunes within a running network while MasRouter constructs the network from scratch, so they could be composed (MasRouter for initial construction, DyLAN for runtime adaptation).
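
The composition suggested above (construct first, prune at runtime) might look like the sketch below. This is a stand-in, not DyLAN's scoring mechanism: the `prune_weakest` helper, the role names, and the peer scores are all hypothetical.

```python
def prune_weakest(agents: list[str], scores: dict[str, float], keep: int) -> list[str]:
    """Keep the `keep` highest-scoring agents, preserving their original order."""
    ranked = sorted(agents, key=lambda a: scores[a], reverse=True)
    survivors = set(ranked[:keep])
    return [a for a in agents if a in survivors]


# Assumed output of construction-time routing.
initial_team = ["planner", "programmer", "reviewer", "tester"]
# Assumed contribution scores collected while the team is running.
peer_scores = {"planner": 0.9, "programmer": 0.8, "reviewer": 0.3, "tester": 0.6}

active_team = prune_weakest(initial_team, peer_scores, keep=3)
print(active_team)  # ['planner', 'programmer', 'tester']
```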

The formalization matters because it surfaces what single-model routing leaves on the table. In light of "When does adding more agents actually help systems?", routing to the right topology per query can be read as MasRouter's response to topology-dependent error amplification: rather than accepting a fixed topology's scaling limitations, it routes around them.

Inquiring lines that read this note 12

This note is a source for these research framings, grouped by the broader line of inquiry each explores.

Can model routing outperform monolithic scaling as an efficiency strategy?

What coordination failures limit multi-agent LLM systems as they scale?

What specific network sizes trigger coordination degradation in LLM systems?

How does reasoning graph topology affect breakthrough insights and generalization?

How should topology routing adapt to different task types?

How do multi-agent systems achieve genuine cooperation and reasoning?

When do multi-agent approaches outperform single model extended thinking?

Can construction-time routing and runtime agent pruning be combined effectively?

What determines success in training models on multiple tasks?

When and what should a model actually decide to delegate?

Related concepts in this collection 5

Can AI systems design unique multi-agent workflows per individual query? Explores whether meta-agents trained with reinforcement learning can automatically generate personalized multi-agent system architectures tailored to individual user queries, rather than applying fixed task-level templates uniformly.
FlowReasoner: more flexible per-query design via RL code generation; MasRouter: more structured cascade
Can multi-agent teams automatically remove their weakest members? Explores whether agents can score each other's contributions during problem-solving and use those scores to deactivate underperforming teammates in real time, improving overall team efficiency.
runtime pruning complements construction-time routing
When does adding more agents actually help systems? Multi-agent systems often fail in practice, but the reasons remain unclear. This research investigates whether coordination overhead, task properties, or system architecture determine when agents improve or degrade performance.
MasRouter routes around topology limitations per query
Can routers select the right model before generation happens? Explores whether LLMs can be matched to queries by estimating difficulty upfront, before any generation begins. This matters because routing could cut costs significantly while preserving response quality.
single-model routing as the simplest case of MASR
Can semantic capability vectors replace manual agent routing? Explores whether embedding agent capabilities in high-dimensional space and matching them semantically can eliminate brittle, manually-maintained topic-based routing in multi-agent systems.
FoA elevates the routing primitives themselves to first-class artifacts: where MasRouter routes per query given a fixed agent pool, FoA enables semantic discovery over a dynamic capability-published ecosystem. The two are at different layers of the stack — MasRouter solves "which agent for this query"; FoA solves "what does each agent advertise it can do"
