Multi-Agent

Single-agent systems hit a ceiling. Context fills, tool sets conflict, and latency compounds when one model tries to do everything. Multi-agent architectures break a problem into scoped responsibilities — each handled by an agent with its own context, tools, and constraints. The pattern is not universally better. On sequential reasoning tasks, decomposition can degrade performance by 39-70%. Start with the failure modes of your single agent before you reach for a second one.


Memory Over Messaging

This is the single most important architectural decision for a multi-agent system. The instinct is to design chatty networks that mirror human collaboration — agents sending messages, negotiating, clarifying. That is slow, token-expensive, and brittle.

| Factor | Messaging | Shared State |
| --- | --- | --- |
| Token cost | Duplicates context across every exchange | Each agent reads only what it needs |
| Latency | Sequential round-trips between agents | Agents operate independently on latest state |
| Debugging | Reconstruct from scattered conversation logs | Single source of truth, inspectable at any point |
| Fault tolerance | Lost message breaks the chain | Agent retries by re-reading current state |
| Scalability | O(n²) message paths | O(n) state readers/writers |

Agents read from and write to shared state — filesystem, session dict, graph state, database. They do not talk to each other; they talk to the state. Design around data flow, not conversation flow. Memory engineering — deciding what state to expose, when to persist it, how to scope access — matters more than optimizing inter-agent prompts.
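
A minimal sketch of agents coordinating through state rather than messages, assuming an in-memory dict stands in for the filesystem, database, or graph state. `SharedState`, `research_agent`, and `code_agent` are hypothetical names for illustration, not part of any framework.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SharedState:
    """In-memory stand-in for a filesystem, database, or graph state."""
    data: dict[str, Any] = field(default_factory=dict)
    log: list[tuple[str, str]] = field(default_factory=list)  # (agent, key) write history

    def read(self, key: str, default: Any = None) -> Any:
        return self.data.get(key, default)

    def write(self, agent: str, key: str, value: Any) -> None:
        self.data[key] = value
        self.log.append((agent, key))


def research_agent(state: SharedState) -> None:
    # Reads only the task; writes findings for whichever agent needs them next.
    task = state.read("task")
    state.write("research", "findings", f"notes on: {task}")


def code_agent(state: SharedState) -> None:
    # Never messages the research agent; it reads the latest findings from state.
    findings = state.read("findings")
    if findings is None:
        return  # nothing to act on yet; safe to retry later
    state.write("code", "patch", f"patch based on {findings}")


state = SharedState()
state.write("user", "task", "add OAuth to the API")
code_agent(state)      # runs too early: no findings yet, so it does nothing
research_agent(state)
code_agent(state)      # the retry succeeds by re-reading current state
print(state.read("patch"))
print(state.log)
```

The write log doubles as the inspectable history: debugging means reading one list, not reconstructing a conversation.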


Orchestration Patterns

Every multi-agent system is a variation or composition of four patterns.

| Pattern | One-line description | Implementations |
| --- | --- | --- |
| Orchestrator/Workers | Central agent decomposes a task, delegates to specialists, synthesizes results. | Claude Code Task tool; OpenAI agent-as-tool; ADK AgentTool; LangGraph supervisor node |
| Handoff Chain | Agent A finishes its stage, then transfers full control to Agent B, with no central coordinator. | OpenAI handoffs (first-class); ADK transfer_to_agent; LangGraph conditional edges over shared State |
| Parallel Fan-Out | Multiple agents run simultaneously on independent subtasks; a gather step merges outputs. | Claude Code --background; ADK ParallelAgent; CrewAI Process.parallel; LangGraph parallel branches with join |
| Peer Mesh | Agents discover and invoke each other directly without a central orchestrator. Most flexible, hardest to debug. | A2A protocol (framework-agnostic, HTTP + signed Agent Cards) |
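
Parallel Fan-Out is the easiest of the four to show in a few lines. The sketch below uses only the standard library's `asyncio`. The `worker` coroutine is a hypothetical placeholder for a real model call, and the subtask names are made up for illustration.

```python
import asyncio


async def worker(name: str, subtask: str) -> str:
    # Placeholder for a real model call; sleep(0) just yields to the event loop.
    await asyncio.sleep(0)
    return f"{name}: finished {subtask}"


async def fan_out(subtasks: dict[str, str]) -> list[str]:
    # Launch one worker per independent subtask and wait for all of them.
    jobs = [worker(name, task) for name, task in subtasks.items()]
    # gather returns results in the order the jobs were passed in.
    return await asyncio.gather(*jobs)


results = asyncio.run(fan_out({
    "docs": "summarize README",
    "tests": "list failing tests",
}))
print(results)
```

The gather step here is a plain list. In practice it is often another agent that merges or ranks the outputs.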

Start with the simplest pattern that solves the problem. A single orchestrator with two workers handles most cases. Add complexity only when you observe bottlenecks, not in anticipation of them.

Orchestrator/Workers in action

sequenceDiagram
    participant U as User
    participant O as Orchestrator
    participant W1 as Research Agent
    participant W2 as Code Agent
    participant W3 as Review Agent

    U->>O: "Add OAuth to the API"
    O->>W1: Explore auth patterns in codebase
    W1-->>O: Found: session-based auth in src/auth/
    O->>W2: Implement OAuth flow
    W2-->>O: Created 3 files, updated 2
    O->>W3: Review changes for security
    W3-->>O: LGTM, 1 suggestion
    O->>W2: Apply suggestion
    W2-->>O: Done
    O-->>U: OAuth implemented, PR ready
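
The same flow can be sketched in code. This is an illustration under stated assumptions, not any framework's API: the worker functions return canned strings instead of calling a model, and the plan is hard-coded where a real orchestrator would have a model produce it. All names are hypothetical.

```python
from typing import Callable

Worker = Callable[[str], str]


def research(task: str) -> str:
    return "found session-based auth in src/auth/"  # stand-in for a model call


def code(task: str) -> str:
    return f"implemented: {task}"


def review(task: str) -> str:
    return "LGTM, 1 suggestion"


WORKERS: dict[str, Worker] = {"research": research, "code": code, "review": review}


def orchestrate(request: str) -> list[tuple[str, str]]:
    """Delegate each stage to a worker and collect (worker, result) pairs."""
    plan = [
        ("research", f"explore existing patterns for: {request}"),
        ("code", request),
        ("review", "review changes for security"),
    ]
    transcript: list[tuple[str, str]] = []
    for worker_name, subtask in plan:
        transcript.append((worker_name, WORKERS[worker_name](subtask)))
    # A suggestion from the reviewer triggers one more coding pass.
    if "suggestion" in transcript[-1][1]:
        transcript.append(("code", WORKERS["code"]("apply review suggestion")))
    return transcript


transcript = orchestrate("OAuth flow")
for worker_name, result in transcript:
    print(f"{worker_name}: {result}")
```

Workers never call each other; every result returns to the orchestrator, which decides the next step.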

Delegation With Isolation

Each sub-agent needs strict boundaries defined at spawn time. Sharing a single context window, tool set, and permission level across agents defeats the purpose of decomposition.

Context isolation and model selection are the easy parts. Tool-registry isolation is the load-bearing one. A sub-agent with the full parent toolset is not a sub-agent — it is the parent under a different name, with a fresh context window and the same blast radius.
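
One way to enforce tool-registry isolation is an allowlist applied at spawn time. The sketch below assumes tools are plain Python callables held in a dict. `PARENT_TOOLS`, `SubAgent`, and the tool names are hypothetical.

```python
from typing import Callable

Tool = Callable[..., str]

# The parent's full registry (illustrative stand-ins, not real implementations).
PARENT_TOOLS: dict[str, Tool] = {
    "read_file": lambda path: f"<contents of {path}>",
    "write_file": lambda path, text: f"wrote {len(text)} chars to {path}",
    "run_shell": lambda cmd: f"ran {cmd}",
}


class SubAgent:
    def __init__(self, name: str, allowed: set[str]):
        unknown = allowed - set(PARENT_TOOLS)
        if unknown:
            raise ValueError(f"unknown tools: {sorted(unknown)}")
        self.name = name
        # Copy only the allowlisted tools; the sub-agent never sees the rest.
        self.tools = {tool: PARENT_TOOLS[tool] for tool in allowed}

    def call(self, tool: str, *args: str) -> str:
        if tool not in self.tools:
            raise PermissionError(f"{self.name} may not call {tool}")
        return self.tools[tool](*args)


reviewer = SubAgent("reviewer", allowed={"read_file"})
print(reviewer.call("read_file", "src/auth/oauth.py"))

try:
    reviewer.call("run_shell", "rm -rf build")
except PermissionError as exc:
    blocked = str(exc)
    print(blocked)
```

A read-only reviewer built this way cannot write files or run commands, whatever its prompt says. Its blast radius is set by the registry, not by instructions.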


Communication Protocols

| Protocol | Scope | Mechanism | When to use |
| --- | --- | --- | --- |
| MCP | Agent ↔ tool | JSON-RPC 2.0 over stdio or Streamable HTTP. Universal tool discovery and invocation. | Giving any agent access to any tool. Does not handle agent-to-agent communication. |
| A2A | Agent ↔ agent | Agents publish signed Agent Cards at /.well-known/agent-card.json; peers discover and invoke over HTTP. v1.2, donated to the Linux Foundation June 2025, 150+ adopting organizations. | Peer Mesh, cross-framework or cross-org interop, long-running delegated tasks. |
| Shared State | Implicit coordination | Key-value store, filesystem, database, or graph state. Agents react to state changes. | Most multi-agent architectures (memory-over-messaging). Decoupled, easy to extend. |
| Agent-as-Tool | Parent → child | One agent calls another as a tool: sends a prompt, receives a result. Called agent has no autonomy. | Orchestrator/Workers inside a single framework. Familiar tool-call ergonomics. |
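
To make the MCP row concrete, the sketch below builds the JSON-RPC 2.0 request an agent would send to invoke a tool through MCP's `tools/call` method. It only serializes the message and involves no transport or server. The tool name `search_code` and its arguments are hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request for MCP's tools/call method."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# "search_code" is a made-up tool name used only for illustration.
wire = tool_call_request("search_code", {"query": "OAuth"})
print(wire)
```

The same envelope travels over stdio or Streamable HTTP; only the transport changes.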

A2A task lifecycle

A2A defines a stateful task lifecycle for cross-agent work:

submitted → working → input-required → auth-required
                    → completed | failed | canceled | rejected

input-required and auth-required are the states that make A2A more than a remote function call — they let a delegated agent suspend, ask its caller (or a human) for clarification or credentials, and resume. This matters for long-running cross-org delegation where a task can sit for hours waiting on a human signature or an OAuth approval.
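
The lifecycle can be modeled as a small state machine. The state names below come from the diagram above. The transition table is an assumption about which moves are legal, read off that diagram and not taken from the A2A specification. `advance` is a hypothetical helper.

```python
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    AUTH_REQUIRED = "auth-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"
    REJECTED = "rejected"


S = TaskState
TERMINAL = {S.COMPLETED, S.FAILED, S.CANCELED, S.REJECTED}
INTERRUPTED = {S.INPUT_REQUIRED, S.AUTH_REQUIRED}

# Assumed transitions: interrupted states resume into WORKING; terminal states never leave.
ALLOWED: dict[TaskState, set[TaskState]] = {
    S.SUBMITTED: {S.WORKING, S.REJECTED, S.CANCELED},
    S.WORKING: INTERRUPTED | TERMINAL,
    S.INPUT_REQUIRED: {S.WORKING, S.CANCELED, S.FAILED},
    S.AUTH_REQUIRED: {S.WORKING, S.CANCELED, S.FAILED},
}


def advance(current: TaskState, nxt: TaskState) -> TaskState:
    """Return nxt if the move is allowed, otherwise raise ValueError."""
    if nxt not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {nxt.value}")
    return nxt


# A task that pauses for an OAuth approval, resumes, and completes.
state = S.SUBMITTED
history = [state.value]
for nxt in (S.WORKING, S.AUTH_REQUIRED, S.WORKING, S.COMPLETED):
    state = advance(state, nxt)
    history.append(state.value)
print(" -> ".join(history))
```

The pause in `auth-required` can last as long as the approval takes. The caller polls or subscribes to the task; it does not hold a connection open.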


Performance

On suitable tasks, multi-agent architectures show roughly 45% faster problem resolution and 60% more accurate outcomes than equivalent single-agent implementations, driven by specialization, parallelism, and review loops. Token efficiency improves 30-50% because each agent works in a smaller context window. The caveat is sharp: on tasks that are fundamentally sequential (long chains of reasoning where each step depends on the last), decomposition can degrade performance by 39-70%. The benefit is task-dependent, not universal. If your problem is one long thought, one long thought is what it needs.


Choosing a Pattern

| If your problem… | Use… |
| --- | --- |
| Has independent subtasks | Parallel Fan-Out |
| Requires sequential stages | Handoff Chain |
| Needs centralized control and synthesis | Orchestrator/Workers |
| Spans multiple frameworks or organizations | Peer Mesh (A2A) |
| Is one long chain of reasoning | A single agent; don’t decompose |