How does workflow position shape attack propagation in multi-agent systems?
Explores whether a malicious signal's influence depends on its injection point in a multi-agent graph, and how task-relevant framing makes downstream agents more likely to relay it without scrutiny.
FLOWSTEER's attack works because of two structural regularities in how multi-agent workflows propagate information. First, position matters: a malicious signal injected into a high-influence subtask propagates much farther than the same signal injected into a peripheral node, because downstream agents depend on the outputs of upstream ones. Influence is not uniform across the graph; it concentrates wherever many dependencies converge. Second, framing matters: a signal dressed in sycophantic, task-relevant language is more likely to be relayed by downstream agents, because it reads as evidence rather than as instruction. The attack therefore aligns a malicious signal with an influential subtask and then guides replanning toward dependency patterns that preserve propagation.
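The position effect can be made concrete with a small sketch. The workflow graph below is hypothetical (the subtask names and edges are illustrative, not taken from FLOWSTEER), and counting reachable downstream nodes is only a rough proxy for influence, assumed here for illustration:

```python
from collections import deque

# Hypothetical workflow: each key is a subtask, each value lists the
# subtasks that consume its output. Names are illustrative only.
WORKFLOW = {
    "plan": ["research", "draft"],
    "research": ["draft", "fact_check"],
    "draft": ["review"],
    "fact_check": ["review"],
    "review": [],
    "format_footnotes": [],
}


def downstream_reach(graph, start):
    """Return the set of nodes reachable from `start`, excluding `start`."""
    seen = set()
    queue = deque(graph[start])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph[node])
    return seen


# Reach size as a crude influence score (an assumption of this sketch).
reach = {node: len(downstream_reach(WORKFLOW, node)) for node in WORKFLOW}
ranked = sorted(reach.items(), key=lambda kv: kv[1], reverse=True)

for node, score in ranked:
    print(f"{node}: {score} downstream subtasks")
```

In this toy graph, a signal injected at `plan` can touch four downstream subtasks, while one injected at `format_footnotes` touches none. A real analysis would likely weight edges by how heavily each agent relies on its inputs rather than counting nodes.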
Together, these two regularities describe propagation mechanics that any MAS (multi-agent system) designer should recognize. The pattern generalizes beyond attacks: legitimate signals also gain or lose influence by position, and any framing that mimics evidence will be over-trusted downstream. One complication is that replanning introduces instability: a manipulated prompt may cause the planner to regenerate roles and dependencies, which could disrupt the propagation path. FLOWSTEER turns even this into an asset by expressing propagation-favorable dependency patterns as natural-language guidance. This tells us where to harden: not every node equally, but the high-influence positions, and not every input equally, but those whose framing borrows the authority of evidence.
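How position and framing compound can be sketched with a toy relay model. It assumes a linear chain in which each downstream agent independently relays the signal with a fixed probability. The two probabilities below are hypothetical placeholders, not measured values, and `expected_contaminated` is an illustrative helper rather than anything defined by FLOWSTEER:

```python
def expected_contaminated(relay_prob, chain_length):
    """Expected number of downstream agents reached in a linear relay chain.

    Assumes each agent independently relays the signal with probability
    `relay_prob`, so the agent k hops away is reached with relay_prob ** k.
    """
    return sum(relay_prob ** k for k in range(1, chain_length + 1))


# Hypothetical relay probabilities, chosen only to illustrate the contrast.
EVIDENCE_FRAMED = 0.9      # signal reads as task-relevant evidence
INSTRUCTION_FRAMED = 0.5   # signal reads as an overt instruction

for label, prob in [("evidence-framed", EVIDENCE_FRAMED),
                    ("instruction-framed", INSTRUCTION_FRAMED)]:
    for hops in (1, 4):  # injected near the end vs. at the head of the chain
        reached = expected_contaminated(prob, hops)
        print(f"{label}, {hops} downstream hop(s): {reached:.4f} agents expected")
```

Under these assumptions, the upstream, evidence-framed injection reaches the most agents in expectation, which is the combination the hardening advice above targets. Real workflows are graphs rather than chains, and relay behavior is unlikely to be independent across agents, so this is an intuition aid only.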
Inquiring lines that use this note as a source 45
This note is a source for these synthesized inquiries.
- Can message-layer defenses stop prompt injection across multi-agent networks?
- What distinguishes task failure from communication breakdown in multi-agent systems?
- Does peer-preservation behavior persist in production agent deployments?
- Do evidence carriers use a single anomaly direction or distributed mechanisms?
- What makes attribution errors uniquely harmful in organizational group dynamics?
- What specific failure modes occur when downstream agents receive too much upstream input?
- How does collaboration topology choice affect error amplification in multi-agent systems?
- How does distributed coordination fail as agent networks scale?
- What distinguishes flow-preserving measurement from cognitive vulnerability profiling?
- How do multi-agent routers balance flexibility against interpretability in design?
- How does component-level self-evolution prevent information loss in multi-agent trajectories?
- How do influence and homophily differ as mechanisms in social networks?
- Can ordinary agent-to-agent messages carry hidden behavioral signals?
- What network topologies are most vulnerable to bias propagation?
- Can architectural changes like adversarial agent roles prevent silent agreement?
- How do single-agent safety evaluations underestimate risks in deployed multi-agent systems?
- Why does attack generation scale faster than defense engineering?
- How do delayed effects complicate causal attribution in agent systems?
- Can single-agent defenses prevent cascading failures in multi-agent systems?
- What makes human overseer bias exploitable in agent workflows?
- How does semantic framing differ from content injection attacks?
- Can influence estimation identify the most valuable trajectories in agentic training?
- How do agent capabilities change across 25 relay rounds of interaction?
- Why does agent-to-agent interaction expose identity verification vulnerabilities?
- How does protocol mediation affect determinism in agentic function calls?
- What prevents multiple agents from corrupting shared state in live artifacts?
- Do learned workflows transfer between different agents with minimal accuracy loss?
- What four decisions matter most in multi-agent system routing?
- How do externalizing cognitive work and coordination infrastructure relate to agent reliability?
- Where should the trust boundary sit in multi-agent planning systems?
- Why does workflow position amplify malicious signals downstream?
- What makes planning-time attacks structurally invisible to downstream inspection?
- Why does increased model capability make detection harder in delegated workflows?
- Why does workflow position amplify malicious signals in multi-agent relay chains?
- How do workflow-inspecting defenses fail when contamination enters at planning time?
- How does prompt injection differ from subliminal message propagation in multi-agent networks?
- Can fixed pipelines eliminate planning-time attacks by sacrificing adaptive coordination?
- What degradation patterns emerge as relay length increases in delegated tasks?
- How does the Catfish Agent intervention reduce premature consensus in multi-agent systems?
- Can existing web security defenses protect agents from content manipulation?
- Which workflow positions concentrate the most downstream dependencies and influence?
- Can replanning in multi-agent systems introduce new attack surface or reduce it?
- Do legitimate task signals exploit the same position and framing vulnerabilities as attacks?
- How do backdoored open-source checkpoints enable covert advertising at scale?
- Why does pre-computed workflow generation work better than runtime tool discovery for data security?
Related concepts in this collection 2
- When does adding more agents actually help systems?
  Multi-agent systems often fail in practice, but the reasons remain unclear. This research investigates whether coordination overhead, task properties, or system architecture determine when agents improve or degrade performance.
  Relation: both find that topology determines how signals (errors or attacks) amplify across a multi-agent graph
- Can one compromised agent corrupt an entire multi-agent network?
  Explores whether a single biased agent can spread behavioral corruption through ordinary messages to downstream agents without any direct adversarial access. Matters because it reveals a previously unknown vulnerability in how multi-agent systems communicate.
  Relation: shares the relay-propagation mechanism where downstream agents pass along bias they did not originate
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- FLOWSTEER: Prompt-Only Workflow Steering Exposes Planning-Time Vulnerabilities in Multi-Agent LLM Systems
- Thought Virus: Viral Misalignment via Subliminal Prompting in Multi-Agent Systems
- Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities
- Agents of Chaos
- AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
- Agentic Misalignment: How LLMs Could Be Insider Threats
- Reinforcement Learning with Rubric Anchors
- AI Agent Traps
Original note title
workflow position amplifies or suppresses malicious signals and sycophantic framing makes downstream agents relay them