Orchestrating parallel agents without Durable Functions
Multi-agent systems don't need a heavyweight orchestration runtime to be reliable. What they need is routing, parallel execution, and a scoring discipline. The pattern, and where it ends.
The moment a system grows past one AI agent, the instinct is to reach for an orchestration runtime — something durable, stateful, with a workflow engine. Sometimes that instinct is right. More often it answers a question nobody asked, and the real problem goes untouched.
The real problem is decomposition
A single agent asked to do a large, multi-faceted job produces confident mush. Not because the model is weak — because the job was never one job. "Analyze this" is really five questions wearing a trench coat: each with its own method, its own data, its own idea of what a good answer looks like.
The first move in any multi-agent system is honest decomposition. Name the independent dimensions. Give each one an agent that does that one thing well. The quality of the whole system is set here, before a line of orchestration code exists.
Three things you actually need
Once the work is decomposed, reliable multi-agent behavior comes from three capabilities — none of which requires a durable workflow engine.
Routing. Not every request needs every agent. A small routing layer reads the request and dispatches it to the skills that actually apply. This keeps cost and latency proportional to the question.
Parallel execution. The independent agents are, by construction, independent — so run them at the same time. The wall-clock cost of the system becomes the cost of its slowest agent, not the sum of all of them.
A scoring discipline. Five agents return five answers. If you stop there, you have opinions, not a decision. A synthesis step combines them into a composite result on one scale, with weights you wrote down and can defend. This is the part teams skip, and it is the part that makes the output trustworthy.
Multi-agent reliability is routing, parallel execution, and a scoring discipline. The workflow engine is optional. The scoring discipline is not.
Skills that compose
Build each agent as a skill that works two ways: standalone, callable on its own, and as a participant in an orchestrated run. When skills compose like this, growing the system means adding a skill — not rewiring the orchestrator. A system that started with five capabilities reaches fifteen without a rewrite, because nothing about adding the sixth touched the other five.
Where this ends
Be honest about the boundary. This lightweight pattern is the right answer when the work fits inside a request: fan out, run in parallel, synthesize, return. It is the wrong answer when you genuinely need durability — workflows that run for days, that pause for human approval and resume next week, that must guarantee exactly-once execution across failures. That is what a durable orchestration runtime is for, and when you need it, use it.
Most multi-agent systems are not that. They are a hard question that decomposes into parallel parts and needs to come back as one scored answer. For those — and there are a lot of them — routing, parallelism, and a scoring discipline are the whole architecture.