How does workflow position shape attack propagation in multi-agent systems?

Explores whether a malicious signal's influence depends on its injection point in a multi-agent graph, and how task-relevant framing makes downstream agents more likely to relay it without scrutiny.

Synthesis note · 2026-05-28 · sourced from Agents Multi Architecture

FLOWSTEER's attack works because of two structural regularities in how multi-agent workflows propagate information. First, position matters: a malicious signal injected into a high-influence subtask propagates much further than the same signal injected into a peripheral node, because downstream agents depend on the outputs of upstream ones. Influence is not uniform across the graph; it concentrates wherever many dependencies converge. Second, framing matters: a signal dressed in sycophantic, task-relevant language is more likely to be relayed by downstream agents, because it reads as evidence rather than as instruction. The attack exploits both: it aligns a malicious signal with an influential subtask and then guides replanning toward dependency patterns that preserve propagation.
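The two regularities can be illustrated with a toy model. The sketch below is not FLOWSTEER's method. It assumes a hypothetical workflow graph with invented agent names, uses the count of transitive downstream consumers as a rough proxy for positional influence, and models framing as a single per-hop relay probability applied independently at each agent. All of these are simplifying assumptions made for illustration.

```python
from collections import deque

# Hypothetical workflow: each key is an agent/subtask and each value lists
# the agents that consume its output. Names are illustrative only.
WORKFLOW = {
    "planner": ["researcher", "coder"],
    "researcher": ["writer"],
    "coder": ["tester"],
    "tester": ["writer"],
    "writer": [],
    "formatter": [],  # peripheral node: nothing depends on it
}


def downstream_reach(graph, source):
    """Return the set of nodes that transitively consume `source`'s output."""
    seen = set()
    queue = deque(graph[source])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph[node])
    return seen


def relay_survival(p_relay, hops):
    """Chance a signal survives `hops` relays, assuming each agent relays it
    independently with probability `p_relay` (a simplifying assumption)."""
    return p_relay ** hops


# Position: how many agents sit downstream of each injection point.
reach = {node: len(downstream_reach(WORKFLOW, node)) for node in WORKFLOW}

# Framing: a higher per-hop relay probability compounds over several hops.
evidence_like = relay_survival(0.9, 3)
instruction_like = relay_survival(0.5, 3)
```

In this toy graph the planner reaches four downstream agents while the formatter reaches none, and a signal that is relayed more readily at each hop survives a three-hop path far more often. The specific probabilities are placeholders, not measurements.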

Together, these two regularities describe propagation mechanics that any MAS designer should recognize. The pattern generalizes beyond attacks: legitimate signals also gain or lose influence by position, and any framing that mimics evidence will be over-trusted downstream. One complication is that replanning introduces instability: a manipulated prompt may cause the planner to regenerate roles and dependencies, which could disrupt the path the signal relies on. FLOWSTEER turns even this into an asset by expressing propagation-favorable dependency patterns as natural-language guidance. The practical upshot is where to harden: not every node equally, but the high-influence positions, and not every input equally, but those whose framing borrows the authority of evidence.
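The hardening advice can be read as a budgeted selection problem. The following is a minimal sketch under stated assumptions: the graph, the agent names, and the `hardening_priority` helper are all hypothetical, and downstream reach is only one possible stand-in for "high-influence position". A real deployment would likely weigh other factors as well.

```python
# Hypothetical workflow graph: node -> list of nodes that consume its output.
GRAPH = {
    "intake": ["planner"],
    "planner": ["retriever", "solver"],
    "retriever": ["solver"],
    "solver": ["reviewer"],
    "reviewer": [],
    "logger": [],
}


def reach_counts(graph):
    """Count transitive downstream consumers for every node (iterative DFS)."""
    counts = {}
    for source in graph:
        seen, stack = set(), list(graph[source])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        counts[source] = len(seen)
    return counts


def hardening_priority(graph, budget):
    """Pick the `budget` nodes with the largest downstream reach.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = reach_counts(graph)
    ranked = sorted(graph, key=lambda name: (-counts[name], name))
    return ranked[:budget]


# With room to harden two positions, the most upstream nodes come first.
to_harden = hardening_priority(GRAPH, budget=2)
```

Ranking by reach captures only the positional half of the argument. Screening inputs whose framing imitates evidence is a separate control that this sketch does not attempt.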

Inquiring lines that read this note 50

This note is a source for these research framings, grouped by the broader line of inquiry each explores.

How do standardized protocols improve coordination in multi-agent systems?

What coordination failures limit multi-agent LLM systems as they scale?

How should agents balance memory condensation to optimize context efficiency?

Does peer-preservation behavior persist in production agent deployments?

What makes AI persuasion effective and how can we counter it?

Do evidence carriers use a single anomaly direction or distributed mechanisms?

Can debate mechanisms prevent silent agreement on wrong answers in multi-agent reasoning?

Why do agents confidently report success despite actually failing tasks?

How do multi-agent systems achieve genuine cooperation and reasoning?

What factors beyond surface content determine how readers extract meaning differently?

What structural factors drive popularity bias in recommendation systems?

How do adversarial and manipulative prompts attack reasoning models?

How should human oversight be integrated with autonomous AI systems?

What makes human overseer bias exploitable in agent workflows?

How can AI agents autonomously learn and transfer skills across tasks?

Can AI systems develop genuine social understanding without embodiment?

Why does agent-to-agent interaction expose identity verification vulnerabilities?

How should systems govern persistent agent-generated code in shared infrastructure?

What prevents multiple agents from corrupting shared state in live artifacts?

Does externalizing cognitive work and state improve agent reliability?

How do externalizing cognitive work and coordination infrastructure relate to agent reliability?

What causes silent corruption to amplify through delegated workflows?

Does decoupling planning from execution improve multi-step reasoning accuracy?

What makes planning-time attacks structurally invisible to downstream inspection?

Why do self-improving systems struggle without clear external performance metrics?

Can single-axis benchmarks accurately predict agent deployment success?

Do trajectory quality metrics predict agent safety and user trust?

Related concepts in this collection 2

When does adding more agents actually help systems? Multi-agent systems often fail in practice, but the reasons remain unclear. This research investigates whether coordination overhead, task properties, or system architecture determine when agents improve or degrade performance.
both find that topology determines how signals (errors or attacks) amplify across a multi-agent graph
Can one compromised agent corrupt an entire multi-agent network? Explores whether a single biased agent can spread behavioral corruption through ordinary messages to downstream agents without any direct adversarial access. Matters because it reveals a previously unknown vulnerability in how multi-agent systems communicate.
shares the relay-propagation mechanism where downstream agents pass along bias they did not originate