One Loop, Three Patterns: A Practical Guide to Multi-Agent Orchestration

If you’ve spent any time with AI coding tools or chat-based assistants, you’ve probably noticed a pattern. The conversation starts sharp. The AI understands the task, makes good suggestions, executes cleanly. But somewhere around the tenth back-and-forth — after a dozen tool calls, a few dead ends, and a growing pile of intermediate results — things get muddy. The AI starts repeating itself. It forgets a decision you made ten minutes ago. It re-reads a file it already summarized. The context window, that finite scratchpad the model uses to track everything, is full of noise from work that no longer matters.

The instinct is to blame the model. But the problem isn’t intelligence — it’s architecture. You’re asking one agent to hold an entire project in its head while simultaneously doing the work. That’s like asking a project manager to write every line of code, run every test, and keep the project plan updated, all in a single unbroken train of thought.

The alternative is to stop treating the AI as one agent and start treating it as several. A parent agent that plans and delegates. Child agents that receive clean, focused prompts, do one job, and return concise results. The parent never sees the mess — it sees summaries.

This is multi-agent orchestration. And while the idea sounds complex, the underlying mechanics are surprisingly simple. Every agent — whether it’s the planner, the worker, or the reviewer — runs the same loop.

The loop that runs everything

Here’s the part that reframes the whole design space: there is no special “coordinator runtime” or “worker runtime.” Every agent, regardless of its role, executes the same cycle.

It receives a prompt. It sends that prompt (along with its available tools) to a language model. The model responds — sometimes with text, sometimes with a request to use a tool. If the model asks for a tool, the agent executes that tool and feeds the result back. Then the model responds again. This continues until the agent decides it’s done, or it hits a turn limit.

Image 1: The Universal Agent Loop

That’s it. A coordinator agent runs this loop. A worker agent runs this loop. A reviewer agent runs this loop. What makes them different isn’t the loop — it’s the configuration. Which model powers it. Which tools it can access. What its system prompt says. How many turns it gets before it’s cut off.

This is worth sitting with for a moment, because it changes how you think about building these systems. You don’t need a “coordinator framework” and a “worker framework.” You don’t need different execution engines for different agent roles. You need one loop and a configuration surface wide enough to express the differences between roles. The orchestration patterns — and there are really only three you need — emerge from how you configure that single primitive.
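To make this concrete, here is a minimal sketch of that loop. Everything in it is an assumption for illustration: AgentConfig, ModelResponse, and the call_model callable stand in for whatever LLM client you actually use; none of this is a real library API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class ModelResponse:
    text: str = ""
    tool_call: Optional[ToolCall] = None  # None means the model answered in plain text

@dataclass
class AgentConfig:
    model: str                  # which model powers the loop
    system_prompt: str          # what this agent's role is
    tools: dict[str, Callable]  # tool name -> callable; defines what the agent CAN do
    max_turns: int = 20         # hard cap before the loop is cut off

def run_agent(config: AgentConfig, prompt: str,
              call_model: Callable[..., ModelResponse]) -> str:
    """The universal loop: the same cycle for coordinator, worker, and reviewer."""
    messages = [{"role": "system", "content": config.system_prompt},
                {"role": "user", "content": prompt}]
    for _ in range(config.max_turns):
        response = call_model(config.model, messages, list(config.tools))
        if response.tool_call is None:
            return response.text  # the agent decided it's done
        # The model asked for a tool: execute it and feed the result back.
        result = config.tools[response.tool_call.name](**response.tool_call.args)
        messages.append({"role": "tool", "content": str(result)})
    return "stopped: turn limit reached"
```

A coordinator, a worker, and a reviewer are all just different AgentConfig values handed to the same run_agent.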

It also means that adding a new type of agent is no longer an infrastructure problem. You aren’t writing new execution logic. You’re writing a new system prompt, choosing a different tool set, and perhaps adjusting the turn limit. The loop itself stays the same; the differences live entirely in configuration. That shift changes how you think about creating and maintaining agents.

Pattern one: Delegation

The simplest pattern. A parent agent hits a sub-task it shouldn’t handle itself — maybe it needs to search a codebase for all API endpoints, or read through a set of test files and summarize coverage. Instead of doing it inline (and polluting its own context with all that raw output), it writes a self-contained prompt describing the task and hands it to a fresh agent.

The child agent starts clean. No history, no accumulated context, no awareness of what the parent has been doing. It gets a prompt and access to the required tools; it does the work and returns a result. That result lands back in the parent’s context as a concise summary, not as fifty lines of grep output.

The key property here is prompt self-containment. The child can’t see anything the parent sees. Everything it needs must be in the prompt. This feels like a limitation, but it’s actually a forcing function for clarity — if the parent can’t articulate the task well enough for it to stand on its own, the decomposition isn’t clean yet. Vague delegation produces vague results, and the self-containment requirement makes vague delegation obvious immediately rather than subtly.

Delegation comes in two flavors.

  • Synchronous: the parent blocks and waits for the result before continuing.
  • Asynchronous: the parent gets back an ID immediately and checks in later.

The synchronous flavor is simpler — the sub-agent’s output arrives as a tool result, and the parent picks up where it left off. The async flavor matters when you want to fire off several delegations without waiting for each one sequentially, which leads directly to the second pattern.
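Both flavors are a thin layer over the run_agent loop above. Here’s one possible sketch; the thread-pool bookkeeping is an illustrative choice, not a prescription:

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Optional

_executor = ThreadPoolExecutor(max_workers=8)
_pending: dict[str, Future] = {}

def delegate_sync(worker: AgentConfig, task_prompt: str, call_model) -> str:
    # Parent blocks; the child's concise result arrives like any tool result.
    return run_agent(worker, task_prompt, call_model)

def delegate_async(worker: AgentConfig, task_prompt: str, call_model) -> str:
    # Parent gets an ID back immediately and checks in later.
    task_id = str(uuid.uuid4())
    _pending[task_id] = _executor.submit(run_agent, worker, task_prompt, call_model)
    return task_id

def check_delegation(task_id: str) -> Optional[str]:
    # None means "still running"; otherwise the child's final summary.
    future = _pending[task_id]
    return future.result() if future.done() else None
```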

So when can a task be delegated? When you can carve out a focused, well-scoped sub-task: one agent, one objective, a self-contained prompt. “Read all the test files and summarize coverage.” “Search the codebase for all API endpoints.” Tasks where the answer is “go do this one thing and report back.” The delegated agent does all the work and comes back with just the response, so your main agent’s context isn’t polluted unnecessarily.

Pattern two: The swarm

Delegation handles one sub-task at a time. But some problems decompose into several independent pieces that could all run simultaneously. You need three modules reviewed. You need competitive analysis from four different angles. You need a security audit, a performance check, and a documentation review — and none of them depend on each other.

A swarm is a single operation that fans out to multiple agents and fans back in to a combined result. You define a team: a shared task description plus a list of agents, each with a name, a role, optionally a customized prompt, and a set of required tools. All agents launch at once. They run in parallel. When they’re all done, their outputs are collected and returned as one structured result — each agent’s name paired with its findings.

The distinction between a swarm and several async delegations might seem subtle, but it matters in practice. With individual async delegations, the parent manages each agent separately — launching them, tracking their IDs, polling for completion, aggregating results manually. A swarm handles the fan-out and fan-in as a single atomic operation. You define the team, the system executes all of them, and you get back one unified result. The bookkeeping is handled for you.

The important nuance is that not every agent in the swarm should be identical. One might have a security-focused system prompt and restricted tools. Another might be tuned for performance analysis. A third checks documentation completeness. Same shared task, different lenses. The configuration-driven nature of the universal loop makes this natural — you’re not building three different agent types, you’re writing three different system prompts and tool configurations over the same loop.

This is where the “one loop, different config” insight pays its first real dividend. If agents were fundamentally different things, building a swarm with mixed capabilities would mean wiring together different systems. But because they’re all the same loop with different configs, a swarm is just a list of configurations launched in parallel.
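Concretely, a sketch might look like this; the team composition, model names, and prompts are invented for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def run_swarm(shared_task: str, team: list[tuple[str, AgentConfig]],
              call_model) -> dict[str, str]:
    # Fan out: every agent launches at once and runs in parallel.
    with ThreadPoolExecutor(max_workers=len(team)) as pool:
        futures = {name: pool.submit(run_agent, cfg, shared_task, call_model)
                   for name, cfg in team}
        # Fan in: one structured result, each name paired with its findings.
        return {name: future.result() for name, future in futures.items()}

# Same shared task, three lenses: three configs over the same loop.
# worker_tools is an execution-only tool dict (see the partitioning section below).
team = [
    ("security",    AgentConfig("fast-model", "Audit the code for security flaws.", tools=worker_tools)),
    ("performance", AgentConfig("fast-model", "Look for performance problems.", tools=worker_tools)),
    ("docs",        AgentConfig("fast-model", "Check documentation completeness.", tools=worker_tools)),
]
```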

Swarms shine when the work is genuinely parallel and independent. They don’t help when tasks have sequential dependencies — when agent B needs agent A’s output before it can start. For that, you need the third pattern.

Pattern three: The coordinator

This is the pattern for complex, multi-phase work. Think “refactor the authentication system” — a task that requires understanding the current state, planning the changes, implementing across multiple files, and then verifying nothing broke. No single delegation covers it. A swarm can’t handle it because the phases are sequential and each one depends on what the previous phase found.

A coordinator is an agent whose entire job is to plan, delegate, collect, synthesize, and iterate. It never does the implementation work itself. It spawns worker agents to gather information, reads their findings, identifies gaps, spawns more workers to fill those gaps, delegates implementation tasks, and then spawns verification workers to check the results.

The coordinator follows a natural four-phase rhythm: research, synthesis, implementation, verification. In the research phase, it spawns workers in parallel to gather information — reading files, searching code, collecting data. In synthesis, it collects those results, identifies what’s missing, and merges findings into a coherent picture. In implementation, it delegates the actual changes to specialized workers. And in verification, it spawns a final round of workers to check that the changes are correct — running tests, reviewing diffs, validating behavior.

This rhythm isn’t enforced by a rigid state machine. It’s a prompt-level pattern — the coordinator’s system prompt tells it to work in these phases, and because language models follow instructions well, it does. The coordinator can adapt: if the research phase reveals the problem is smaller than expected, it might skip straight to implementation. If verification fails, it loops back to synthesis.
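Put differently, a coordinator is not a new runtime; it’s a config. In a sketch (the prompt wording, model name, and orchestration_tools set are all assumptions, with the tool partition defined in the next section):

```python
# The four-phase rhythm lives in the system prompt, not in a state machine.
COORDINATOR_PROMPT = """You coordinate worker agents. Work in phases:
1. RESEARCH: spawn workers in parallel to gather information.
2. SYNTHESIS: merge their findings and identify what's missing.
3. IMPLEMENTATION: delegate the actual changes to specialized workers.
4. VERIFICATION: spawn workers to run tests and review the results.
Adapt as needed: skip phases that aren't necessary, and loop back to
synthesis if verification fails. Never do implementation work yourself."""

coordinator = AgentConfig(
    model="strong-model",        # planning tends to warrant a stronger model
    system_prompt=COORDINATOR_PROMPT,
    tools=orchestration_tools,   # spawn/message/cancel only; defined in the next section
    max_turns=100,               # coordinators live longer than workers
)
```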

But there’s one thing about the coordinator that is enforced structurally, not by prompt. And it’s the part of this architecture that matters most.

The real enforcement: Tool partitioning

Here’s the problem that breaks most naive multi-agent systems. If every agent has access to every tool, there’s no structural reason for a coordinator to delegate instead of just reading files and running commands itself. And there’s nothing stopping a worker from spawning its own sub-agents, which spawn their own sub-agents, in an uncontrolled cascade that burns through your API budget exponentially.

The fix isn’t telling agents what they shouldn’t do. It’s removing the capability entirely.

In a well-designed system, tools are split into three categories. Orchestration tools — spawning agents, sending messages between them, cancelling running agents — are available only to the coordinator. Execution tools — shell commands, file operations, search — are available only to workers. A small set of shared tools, like task tracking, are available to both.

This creates two hard guarantees. The coordinator cannot read a file or run a command, because those tools don’t exist in its world. If it needs information, it must delegate — which forces it to articulate what it needs in natural language. And workers cannot spawn sub-agents, because the spawn tool isn’t in their toolset. No amount of creative prompting bypasses this, because you can’t call a tool that isn’t there.
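In code, the partition can be as plain as three dictionaries; each agent’s config is then built from a subset, so the guarantee holds by construction. The tool names below are illustrative and the bodies are stubs:

```python
def _stub(**kwargs):
    """Placeholder body; a real tool implementation goes here."""

ORCHESTRATION_TOOLS = {   # coordinator only: spawn, message, cancel
    "spawn_agent": _stub, "send_message": _stub, "cancel_agent": _stub,
}
EXECUTION_TOOLS = {       # workers only: shell, files, search
    "run_shell": _stub, "read_file": _stub, "search_code": _stub,
}
SHARED_TOOLS = {          # available to both: task tracking
    "update_task": _stub,
}

orchestration_tools = ORCHESTRATION_TOOLS | SHARED_TOOLS  # cannot read files
worker_tools = EXECUTION_TOOLS | SHARED_TOOLS             # cannot spawn agents
```

A coordinator built from orchestration_tools can want to read a file all it likes; the call simply has nowhere to go.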

Without these boundaries, you get two predictable failure modes.

The first is the busy-manager problem. The coordinator starts doing the work itself — reading files, running grep, scrolling through test output. It gets buried in implementation details, burns through its context window with raw data, and produces worse results than a simple single agent would have. The irony is that a coordinator that “helps” by doing work directly is less effective than one that’s forced to delegate, because the act of delegating forces clear task articulation, and clear task articulation produces better results.

The second is infinite delegation. A worker spawns a sub-worker, which spawns a sub-sub-worker. Each level incurs API costs and context window usage. Without limits, this compounds exponentially. The original task never completes because every agent keeps delegating downward instead of actually doing work. It’s recursion without a base case.

Tool partitioning makes both structurally impossible. And that word — structurally — is the point. When you rely on prompts to constrain behavior, you’re relying on model judgment under pressure. Long context, ambiguous tasks, accumulated noise — these are exactly the conditions where model judgment degrades. The agents that misbehave most are the ones that have been running longest and have the most cluttered context. A prompt that says “don’t read files directly” works fine at turn three. By turn forty, the model has forgotten or rationalized past it.

Removing the tool is a guarantee that holds regardless of how confused, tired, or creative the model gets. It’s the difference between a safety guideline and a guardrail.

Choosing the right pattern

The three patterns form a complexity gradient, and the meta-principle is to use the simplest one that works.

If the task is focused and well-scoped — “search the codebase for all uses of this deprecated function” — delegation is enough. One agent, one job, clean result.

If the task decomposes into independent parallel work — “review these three modules for security issues” — a swarm handles the fan-out and fan-in in a single operation.

If the task requires planning, iteration, and multi-phase execution — “refactor this subsystem and verify the tests still pass” — the coordinator pattern earns its overhead.

And if the task can be answered directly by a single agent in a few turns, don’t orchestrate at all. Every pattern adds API cost, latency, and complexity. Orchestration pays for itself when the task genuinely benefits from parallelism, context isolation, or hierarchical planning. It doesn’t pay for tasks one focused agent could handle in five turns. The most common mistake in multi-agent systems isn’t using the wrong pattern — it’s orchestrating when you didn’t need to.
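If it helps to see the gradient compressed, here it is as a few lines of illustrative pseudocode (the task attributes are made up; the real decision is a judgment call, not code):

```python
def choose_pattern(task) -> str:
    # Use the simplest pattern that works; orchestrate only when it pays for itself.
    if task.fits_in_a_few_focused_turns:
        return "no orchestration"   # one agent, no overhead
    if task.is_one_well_scoped_subtask:
        return "delegation"
    if task.splits_into_independent_parallel_pieces:
        return "swarm"
    return "coordinator"            # multi-phase work with sequential dependencies
```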

The underlying insight is that the power of multi-agent systems doesn’t come from any individual agent being smarter. It comes from composition. The same model, running the same loop, configured differently and coordinated through clear boundaries, can tackle problems that no single invocation could handle. The engineering challenge isn’t in the agents — it’s in the orchestration. The boundaries, the tool partitioning, the configuration. Get those right and the agents almost take care of themselves.

