// Multi-agent orchestration on the Claude Agent SDK
// Scoped tools per subagent + structured error handling
const orchestrator = new Agent({
  model: "claude-opus-4-6",
  subagents: [researcher, reconciler, reporter],
  tools: [mcpCrm, mcpWarehouse, mcpSlack],
  onToolError: ({ category, retryable }) =>
    retryable ? "retry" : "escalate",
});

const result = await orchestrator.run({
  task: "Reconcile Q1 invoices and flag anomalies",
  humanInLoop: { escalateOn: "confidence < 0.9" },
});
// { status: "complete", flags: 3, escalated: 1, confidence: 0.97 }

What it is
Most agent demos run in notebooks with hardcoded inputs. Production has edge cases, malformed data, rate limits, and real users who will find every gap. We build agents that handle the full surface area — not just the happy path — on the Claude Agent SDK.
That means multi-agent orchestration with scoped subagents, MCP tool and resource interfaces that give Claude safe access to your backend systems, and structured output engineering — JSON schemas, few-shot examples, and extraction patterns that make agent responses reliable enough to act on. We pay close attention to tool interface design — clear descriptions, tight boundaries, and per-subagent tool scoping — because that's where most production agents quietly fail. Errors come back structured (retryable vs. business vs. permission) so the agent can recover locally instead of escalating every hiccup. Every agent we ship includes human-in-the-loop escalation, self-evaluation checkpoints, and a failure-mode map so your team knows what to watch for.
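To make the structured-error idea concrete, here is a minimal sketch of how an error category might map to a recovery decision. The three category names come from the paragraph above; the `ToolError` shape, the `decideRecovery` helper, the retry cap, and the backoff values are all assumptions for illustration, not part of the Claude Agent SDK.

```typescript
// Hypothetical error shape. Only the three category names come from the text;
// every other name and number here is an assumption for illustration.
type ErrorCategory = "retryable" | "business" | "permission";

interface ToolError {
  category: ErrorCategory;
  message: string;
  attempt: number; // 1-based count of attempts made so far
}

type Recovery =
  | { action: "retry"; delayMs: number }
  | { action: "report"; reason: string }
  | { action: "escalate"; reason: string };

const MAX_ATTEMPTS = 3; // assumed cap
const BASE_DELAY_MS = 500; // assumed backoff base

function decideRecovery(err: ToolError): Recovery {
  switch (err.category) {
    case "retryable":
      // Transient failure (for example a rate limit): back off exponentially,
      // then hand off to a human once the cap is reached.
      if (err.attempt < MAX_ATTEMPTS) {
        return { action: "retry", delayMs: BASE_DELAY_MS * Math.pow(2, err.attempt - 1) };
      }
      return { action: "escalate", reason: `gave up after ${err.attempt} attempts: ${err.message}` };
    case "business":
      // A rule violation will not change on retry; return it to the agent as a finding.
      return { action: "report", reason: err.message };
    case "permission":
      // The agent cannot grant itself access, so this goes to a human.
      return { action: "escalate", reason: err.message };
  }
}
```

With a mapping like this, only permission failures and exhausted retries reach your on-call team; rate limits and business-rule rejections are handled locally.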
What you get
- 01 Agent architecture: Tool inventory, model selection (Opus vs. Sonnet), subagent decomposition, and a failure-mode map for the system.
- 02 MCP interfaces: Typed MCP tool and resource servers that expose your CRM, warehouse, and internal APIs to Claude with per-tool scoping and structured error responses.
- 03 Structured outputs: JSON schemas, few-shot libraries, and extraction patterns tuned to your data so downstream systems can trust the response.
- 04 Human-in-the-loop: Escalation thresholds, approval queues, and self-evaluation checkpoints for any action with real-world blast radius.
- 05 Subagent library: Scoped subagents for research, extraction, tool-use, and review, composable across workflows rather than single-use scripts.
- 06 Handoff docs: Operator runbook covering monitoring, common failure modes, and the escalation path for your on-call team.
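As a rough illustration of items 03 and 04 working together, the sketch below validates an agent response before anything downstream acts on it, then applies an escalation threshold. The field names mirror the sample output at the top of this page and the 0.9 threshold mirrors its `escalateOn` setting. The `"failed"` status value and the `parseResult` and `needsHumanReview` helpers are assumptions; a real engagement would more likely generate the validator from a JSON schema.

```typescript
// Field names follow the sample output at the top of the page.
// The "failed" status and both helper functions are hypothetical.
interface ReconcileResult {
  status: "complete" | "failed";
  flags: number;
  escalated: number;
  confidence: number; // expected in [0, 1]
}

const isStatus = (v: unknown): v is ReconcileResult["status"] =>
  v === "complete" || v === "failed";
const isCount = (v: unknown): v is number =>
  typeof v === "number" && Number.isInteger(v) && v >= 0;
const isUnitInterval = (v: unknown): v is number =>
  typeof v === "number" && v >= 0 && v <= 1;

// Returns a typed result, or null if the raw text is not a valid response.
function parseResult(raw: string): ReconcileResult | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const { status, flags, escalated, confidence } = data as Record<string, unknown>;
  if (!isStatus(status) || !isCount(flags) || !isCount(escalated) || !isUnitInterval(confidence)) {
    return null;
  }
  return { status, flags, escalated, confidence };
}

const ESCALATION_THRESHOLD = 0.9; // mirrors "confidence < 0.9" in the sample

// Anything unparseable, unfinished, or low-confidence goes to a person.
function needsHumanReview(result: ReconcileResult | null): boolean {
  return result === null || result.status !== "complete" || result.confidence < ESCALATION_THRESHOLD;
}
```

Treating an unparseable response the same as a low-confidence one keeps the failure path simple: downstream systems only ever see results that passed both checks.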
How we engage
A process designed for production.
Workflow mapping
We map the workflow you're automating, define what "working" looks like, and scope the tool surface the agent actually needs.
MCP + tool design
We design the MCP interfaces, structured-output schemas, and permission boundaries before any agent code is written.
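One way to picture a permission boundary is as a per-subagent allowlist that is checked before any tool call goes out. The subagent names below echo the sample at the top of this page. The tool identifiers, the scope mapping, and the `authorizeToolCall` helper are hypothetical, and in practice the boundary would be enforced in the MCP server or SDK configuration rather than in a standalone function.

```typescript
// Subagent names echo the sample at the top of the page.
// Tool identifiers, the mapping, and authorizeToolCall are assumptions.
type Subagent = "researcher" | "reconciler" | "reporter";
type Tool = "crm.read" | "warehouse.read" | "warehouse.write" | "slack.post";

// Each subagent sees only the tools its job requires.
const TOOL_SCOPES: Record<Subagent, ReadonlySet<Tool>> = {
  researcher: new Set<Tool>(["crm.read", "warehouse.read"]),
  reconciler: new Set<Tool>(["warehouse.read", "warehouse.write"]),
  reporter: new Set<Tool>(["slack.post"]),
};

interface Allowed {
  ok: true;
}

interface PermissionDenied {
  ok: false;
  category: "permission";
  message: string;
}

// Deny by default: a call passes only if the tool is in the subagent's scope.
function authorizeToolCall(agent: Subagent, tool: Tool): Allowed | PermissionDenied {
  if (TOOL_SCOPES[agent].has(tool)) {
    return { ok: true };
  }
  return {
    ok: false,
    category: "permission",
    message: `${agent} is not scoped for ${tool}`,
  };
}
```

Here a `reporter` that tries to write to the warehouse gets a structured permission error instead of a silent success, which is the behavior we want settled before agent code exists.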
Build against evals
Iterative development against the eval suite you'll keep forever. You see passing tests, not demo videos.
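For a sense of what "passing tests" can mean here, the following is a deliberately small sketch of an eval harness. Every name in it is an assumption for illustration: `runEvals`, the `EvalCase` shape, and especially `flagMismatches`, which is a deterministic stand-in for a real agent call so the example runs on its own.

```typescript
// A minimal eval harness. All names are hypothetical; flagMismatches is a
// deterministic stand-in for the agent so the sketch runs without a model.
interface EvalCase<I, O> {
  name: string;
  input: I;
  check: (output: O) => boolean;
}

interface EvalReport {
  passed: number;
  failed: string[]; // names of failing cases
}

function runEvals<I, O>(agent: (input: I) => O, cases: EvalCase<I, O>[]): EvalReport {
  const result: EvalReport = { passed: 0, failed: [] };
  for (const c of cases) {
    let ok = false;
    try {
      ok = c.check(agent(c.input));
    } catch {
      ok = false; // a thrown error is a failed case, not a crashed suite
    }
    if (ok) {
      result.passed += 1;
    } else {
      result.failed.push(c.name);
    }
  }
  return result;
}

interface Invoice {
  id: string;
  billed: number;
  received: number;
}

// Stand-in agent: flags invoices whose billed and received amounts differ.
const flagMismatches = (invoices: Invoice[]): string[] =>
  invoices.filter((i) => i.billed !== i.received).map((i) => i.id);

const invoiceCases: EvalCase<Invoice[], string[]>[] = [
  {
    name: "clean batch yields no flags",
    input: [{ id: "A1", billed: 100, received: 100 }],
    check: (out) => out.length === 0,
  },
  {
    name: "short payment is flagged",
    input: [{ id: "B2", billed: 100, received: 90 }],
    check: (out) => out.length === 1 && out[0] === "B2",
  },
  {
    name: "empty batch does not throw",
    input: [],
    check: (out) => out.length === 0,
  },
];

const summary = runEvals(flagMismatches, invoiceCases);
```

Real suites add model calls, tolerance for non-deterministic output, and many more edge cases, but the basic contract stays the same: named cases, explicit checks, and a report your team can rerun after every change.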
Ship with humans in the loop
Production deployment with escalation wired in, monitoring live, and a 30-day support window on your shared Slack.
Ready to build something that actually works?
We start every engagement with a two-week discovery sprint. No retainer required. You walk away with a spec whether you build with us or not.
Start a project →