One agent is useful. Multiple agents working together are a system. Today we go from single-agent tool use to delegation, handoffs, and parallel dispatch.
office-hours.dev · 1900 Broadway, 2nd Floor · Oakland, CA
Returning faces, new faces. What shipped from Session 01. Quick show of hands on multi-agent experience.
Why a single agent breaks at scale: context limits, mixed concerns, compounding errors. When delegation pays for itself.
The four shapes most multi-agent systems collapse into. Which to reach for, and when.
One Sonnet orchestrator dispatches three Haiku reviewers in parallel — security, style, correctness. Fan out, fan in, aggregate. Clone the repo and follow along.
Pick a build project based on your skill level and what you care about. We help you scope it.
You build. We float. Ask anything. Change direction. Go deeper. The room is yours.
Quick demos of what people built. 2 minutes each. No pressure, but encouraged.
Quick show of hands. No wrong answers.
Tool definitions, function calling, the basic loop from Session 01.
Parent agent dispatches a child with a scoped task and a fresh context.
Fan-out to multiple workers, fan back in with aggregated results.
Running in the real world, handling real traffic, recovering from real failures.
We'll use these words a lot in the next three hours. If any of them feel fuzzy, come back to this slide.
A language model plus tools it can call. From Session 01. The base unit of everything we build today.
A function the model can invoke, declared with a name, description, and typed input schema. The model requests, your code executes.
Everything a model sees on one call — system prompt, messages, tool results. Each subagent gets its own fresh one.
An agent invoked by another agent, usually with a scoped task and isolated context. Exposed as a tool to its parent.
The parent agent that plans, dispatches subagents, and aggregates their outputs. Does no domain work itself — only routing.
The structured data passed when one agent invokes another. Typed inputs in, typed outputs out. The contract between agents.
Dispatch N independent subagents in parallel (fan-out), collect all N results (fan-in), then aggregate. The PR Review Crew pattern.
The software that runs the agent loop for you — message plumbing, tool dispatch, retries, state. Examples: Claude Code, Claude Agent SDK, OpenAI Agents SDK, LangGraph.
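The glossary terms above map straight onto code. A minimal sketch of a tool declaration in the Anthropic tools format (name, description, typed input schema) plus the dispatch step your code owns; the `get_weather` function and its behavior are hypothetical stand-ins:

```python
# A tool declaration: name, description, and typed input schema.
# The model sees this; your code executes the matching function.
get_weather_tool = {
    "name": "get_weather",
    "description": "Return current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# The model requests; your code executes. Stubbed with a fixed reply here.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # hypothetical implementation

TOOL_HANDLERS = {"get_weather": get_weather}

def dispatch(tool_name: str, tool_input: dict) -> str:
    """Route a model-issued tool call to the function that implements it."""
    return TOOL_HANDLERS[tool_name](**tool_input)

print(dispatch("get_weather", {"city": "Oakland"}))  # → Sunny in Oakland
```

Everything today builds on this shape: a subagent is just a handler in that dict whose body runs another agent loop.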
One context window holds everything.
One role, doing all the jobs.
Sequential: think, act, think, act.
One failure cascades into total failure.
Hard to specialize. Prompts bloat.
Isolated contexts, scoped to each subtask.
Specialists with narrow responsibilities.
Parallel where independent, sequential where ordered.
One subagent fails; the supervisor retries or routes around it.
Each agent is small, readable, testable.
The key insight: An agent system is just an agent that treats other agents as tools. Same tool-calling API surface from Session 01 — pointed inward. You already know how to build this.
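That insight fits in a few lines. A sketch, with the model call stubbed out, of a subagent as nothing more than a tool handler that runs its own loop on a fresh context (all names here are illustrative):

```python
def call_model(system_prompt: str, messages: list) -> str:
    # Stub standing in for a real API call; returns a canned reply.
    return f"[{system_prompt}] reviewed {len(messages)} message(s)"

def run_subagent(system_prompt: str, task: str) -> str:
    """A child agent: fresh context containing only its scoped task."""
    messages = [{"role": "user", "content": task}]  # no parent history
    return call_model(system_prompt, messages)

# To the parent, the subagent is just another tool handler.
PARENT_TOOLS = {
    "security_review": lambda task: run_subagent("security reviewer", task),
}

result = PARENT_TOOLS["security_review"]("Review this diff for injection risks")
```

Same tool-calling surface as Session 01; the handler just happens to start a second agent.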
The parent agent invokes a child with a scoped task and a fresh context. The child doesn't see the parent's history — only what it needs.
Structured inputs and outputs between agents. Pydantic, JSON Schema, typed messages — anything that turns "text in, text out" into a real contract.
Who decides when a subtask is done, what to retry, when to escalate. This is the code the orchestrator owns and subagents don't touch.
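The three pieces above, delegation, typed handoffs, and orchestrator-owned control flow, can be sketched with stdlib dataclasses (Pydantic works the same way; every name below is illustrative, and the reviewer is a stub for a model call):

```python
from dataclasses import dataclass

@dataclass
class ReviewRequest:      # typed input: the handoff contract
    diff: str
    focus: str

@dataclass
class ReviewResult:       # typed output
    verdict: str          # "pass" | "fail" | "error"
    notes: str

def reviewer(req: ReviewRequest) -> ReviewResult:
    # Stub subagent; a real one would run its own agent loop.
    return ReviewResult(verdict="pass", notes=f"checked {req.focus}")

def orchestrate(req: ReviewRequest, max_retries: int = 2) -> ReviewResult:
    """Control flow lives here: the orchestrator decides retry vs escalate."""
    for _attempt in range(max_retries + 1):
        result = reviewer(req)
        if result.verdict != "error":
            return result                              # accept and move on
    return ReviewResult("escalated", "needs a human")  # escalation path

out = orchestrate(ReviewRequest(diff="+ eval(user_input)", focus="security"))
```

The subagent never sees `max_retries`; retry and escalation policy stay in the orchestrator.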
Four shapes most multi-agent systems collapse into. Each has its own strengths and failure modes. Real systems mix and match.
One planner dispatches specialists in parallel and aggregates their outputs. Hub and spokes. Fan-out, fan-in.
Use when: tasks cleanly split into independent subtasks.
Each agent transforms the input and hands off to the next. Sequential, typed stages. The inbox triage pattern.
Use when: work has a natural order — classify → act → confirm.
A classifier decides which specialist gets the task. Only one path executes. Cheap, deterministic dispatch.
Use when: input types are heterogeneous but each needs one specialist.
An executor produces output; a critic reviews and sends feedback. Repeat until the critic accepts. Quality over latency.
Use when: correctness matters more than speed — code, long-form writing, plans.
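To make one of these shapes concrete, here is a minimal sketch of the pipeline pattern (classify → act → confirm); every stage is a stub standing in for an agent call, and the triage logic is invented for illustration:

```python
def classify(email: str) -> dict:
    # Stage 1: stub classifier; a real stage would be an agent call.
    label = "billing" if "invoice" in email.lower() else "general"
    return {"email": email, "label": label}

def act(msg: dict) -> dict:
    # Stage 2: draft a response based on the classification.
    msg["draft"] = f"Auto-reply for a {msg['label']} question"
    return msg

def confirm(msg: dict) -> dict:
    # Stage 3: final check before anything leaves the system.
    msg["approved"] = len(msg["draft"]) > 0
    return msg

def pipeline(email: str) -> dict:
    """Each stage transforms the input and hands off to the next."""
    result = classify(email)
    for stage in (act, confirm):
        result = stage(result)
    return result

out = pipeline("Where is my invoice?")
```

The other three shapes are rearrangements of the same parts: run stages concurrently and you have orchestrator-worker; make stage 1 pick exactly one stage 2 and you have a router; loop act/confirm until confirm accepts and you have evaluator-optimizer.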
The PR Review Crew you'll see in the demo is Orchestrator-Worker. Once you see one pattern clearly, the others are small variations on the same mental model.
Four runtimes — software that runs the agent loop for you, handling message plumbing, tool dispatch, parallel calls, retries, and state. Same mental model across all of them; pick based on how much control you need.
Claude Code subagents. Define an agent in markdown, give it tools, reference it from a parent. The CLI handles the loop.
Claude Agent SDK. OpenAI Agents SDK. LangGraph. You define agents, tools, and handoffs in code. The SDK runs the loop with typed messages.
Why this matters today: You don't have to build the orchestrator loop from scratch. Pick a runtime that matches your control-vs-ergonomics preference, wire in your tools and prompts, and you're coordinating agents in an afternoon.
Same four-step shape as a tool call from Session 01. The tool just happens to be another agent.
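Those four steps, sketched with the model stubbed out (a real loop would call the API; here the "tool" behind `style_reviewer` is itself an agent, and all names are illustrative):

```python
def fake_model(messages, tools):
    # Stub "model": requests a tool call first, then finishes once it
    # sees a tool result in the transcript.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "text", "text": "done"}
    return {"type": "tool_use", "name": "style_reviewer",
            "input": {"diff": "+ x=1"}}

def style_reviewer(diff: str) -> str:
    # The tool is another agent; stubbed here.
    return "2 nits, no blockers"

TOOLS = {"style_reviewer": style_reviewer}

def agent_loop(task: str) -> str:
    messages = [{"role": "user", "content": task}]
    while True:
        reply = fake_model(messages, TOOLS)        # 1–2: model sees context, decides
        if reply["type"] != "tool_use":
            return reply["text"]                   # model is finished
        output = TOOLS[reply["name"]](**reply["input"])       # 3: execute
        messages.append({"role": "tool", "content": output})  # 4: return result

print(agent_loop("review this PR"))  # → done
```

Swap `fake_model` for a real API call and `style_reviewer` for a subagent and you have the whole system.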
The orchestrator never reviews code directly. It plans, routes, and aggregates. Specialists do the work. Each runs in its own context, sees only what it needs, and returns structured output.
A working reference implementation of the orchestrator-worker pattern. One Sonnet orchestrator. Three Haiku reviewers. Parallel dispatch, structured handoffs, per-run cost tracking.
Clone the repo, flip MOCK=true to see the orchestration flow end-to-end with no API calls.
Flip it to false and the cost tracker prints Sonnet + Haiku spend for every run — plus what the same work would cost all-Sonnet or all-Opus.
The SUBAGENT_TOOLS declaration. The parallel dispatch. The schemas.py handoff contract. The cost report.
Subagent system prompts, the aggregation logic, and a fourth specialist of your choice — performance, accessibility, whatever matters to your team.
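The dispatch step at the heart of the demo can be sketched in a few lines of asyncio. This illustrates the fan-out/fan-in shape, not the repo's actual code; the three reviewer stubs stand in for Haiku calls:

```python
import asyncio

async def reviewer(focus: str, diff: str) -> dict:
    # Stub for one Haiku subagent; a real one would await an API call.
    await asyncio.sleep(0)               # yield, as a network call would
    return {"focus": focus, "verdict": "pass"}

async def review_pr(diff: str) -> dict:
    # Fan-out: dispatch all three reviewers concurrently.
    results = await asyncio.gather(
        reviewer("security", diff),
        reviewer("style", diff),
        reviewer("correctness", diff),
    )
    # Fan-in: the orchestrator aggregates their structured outputs.
    return {
        "approved": all(r["verdict"] == "pass" for r in results),
        "reports": results,
    }

report = asyncio.run(review_pr("+ print('hello')"))
```

Because the reviewers are independent, total latency is roughly one reviewer's latency, not three; that is the whole argument for fan-out.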
Declarative. Markdown files define subagents. Parent agent references them. No orchestrator code to write. Fastest path from idea to working system.
Best for: dev workflows, research crews
Claude Agent SDK or OpenAI Agents SDK. Define agents and typed handoffs in code. SDK runs the loop, handles retries, tracks state. Good middle ground.
Best for: product features, APIs
LangGraph or your own orchestrator. Full control of routing, state, and retries. More code, more flexibility. The PR Review Crew lives here.
Best for: production systems with specific needs
Pick one. Build it in the next 2 hours. We're here to help.
One orchestrator plus three researcher subagents, each looking at a different angle of the same topic. Aggregate into a single report.
Tools: Claude Code · No code required
Take your Session 01 briefing agent and split it into parallel specialists — news, weather, calendar, markets — each writing its own section.
Tools: Claude Desktop + MCP · Light config
Clone aiofficehours/pr-review. Tune the three subagent prompts for a codebase you know. Add a fourth specialist — performance, accessibility, tests, your call.
TODO(you): prompts, SUBAGENT_TOOLS
Tools: Python + Anthropic SDK · Some coding
Chain three agents: classifier, responder, escalator. Each hands off to the next with structured context. Point it at a real inbox.
Tools: Agent SDK + Gmail MCP · Some coding
Build your own orchestrator with the Claude Agent SDK or OpenAI Agents SDK. Typed handoffs, parallel dispatch, retries. The full loop, your way.
Tools: Agent SDK + your editor · Full code
Build an MCP server that exposes an internal agent team as a single tool. Any MCP client calls review_pr and the orchestrator fans out under the hood.
Tools: Python or Node + MCP SDK · Full code
No skill gate. Pick a real trip you actually want to take. Wire up agents that coordinate across flights, hotels, calendar, weather, maps, and your budget — then let them hand off to each other. Start in Claude Desktop with MCP servers; graduate to the PR Review Crew pattern if you want parallel dispatch and a supervisor loop. Bring back a plan you'd actually book.
Tools: your call · Any skill level
If your project needs API keys (Anthropic, OpenAI, Google, etc.), we'll provision them on-site through Keyfree. No credential management required. You use the capability without seeing the key.
keyfree.dev
github.com/aiofficehours/pr-review
Clone it, make dev, read the orchestrator
docs.anthropic.com/en/docs/agents-and-tools
Claude Agent SDK · Claude Code subagents
Blog: Building effective agents
OpenAI Agents SDK
LangGraph (graph-based orchestration)
MCP servers: mcp.so, github.com/modelcontextprotocol/servers
office-hours.dev/session02
Recording posted after the session
Built something? Show the room. 2 minutes each. No slides needed. Just share your screen and tell us what you made.
Not required. But you'll be surprised how much you built in 2 hours.
What happens when agents take irreversible actions? Build an agent that makes a real transaction, inspect the receipt chain, and see what happens when something goes wrong. May 7, same time.
May 7 · Trust, Safety & Letting Agents Spend Money
May 21 · Debugging Agents in Production
June 4 · Agents Meet the Physical World
office-hours.dev
hlos.ai
day-zero.dev