writeup · IOHK

How I build with AI

A custom Claude Code framework at IOHK — 7 specialized agents, 4 MCP servers, and automated PR review in the loop, documented in a public PROMPT_HISTORY.md.

Solo — design, build, infra
Claude CodeMCPCursorcustom agentsGitHub Actions

TL;DR

  • A custom Claude Code framework at IOHK: 7 specialized agents and 4 MCP servers.
  • GitHub Actions runs automated PR review inside the loop.
  • A documented PROMPT_HISTORY.md is the evidence — practices, not proprietary code.

Problem

AI coding, made repeatable.

Most AI-assisted coding is ad hoc — a good prompt one day, a hallucinated refactor the next. Inside a real engineering org that's not good enough. The goal was a workflow reliable enough to trust on production code, and legible enough that other engineers could adopt it.

Architecture

task → router agent → 7 specialized agents → 4 MCP servers (repo · docs · CI · tracker) → GitHub Actions PR review

Key decisions

Docs-as-context over bigger prompts

Chose to feed the model curated, versioned docs as context rather than pile more instructions into the prompt. Trade-off: the docs need upkeep, but grounding gets far sharper and the prompts stay short.

Deterministic checks in code over fuzzy reasoning

Chose to push everything verifiable — lint, types, tests, schema — into code, and let the model reason only where ambiguity is real. Trade-off: less "magic", but the loop is reliable and the failures are legible.

Employer-safe by design

Chose to share the practices, never the proprietary code. Trade-off: fewer concrete internals on display, but everything here is publicly shippable and honest about that boundary.

The workflow is only credible if it's documented. PROMPT_HISTORY.md is the receipt — a running log of what I asked, what the agents did, and where I overrode them. The process is the artifact.

— why PROMPT_HISTORY.md exists

Harder than expected

Keeping 7 agents from stepping on each other. Without strict context boundaries and a clear router, agents duplicated work and contradicted one another. Most of the design effort went into context discipline — who sees what, and who is allowed to decide.

Results

  • 7 agents — specialized Claude Code agents
  • 4 MCP — servers wired into the workflow
  • PR review — automated via GitHub Actions

Excerpts

# agents/reviewer.md
role: PR reviewer
context: [docs/architecture.md, docs/conventions.md]
rules:
  - block on failing types or tests (deterministic gate)
  - comment, never push — a human merges
  - cite the convention you are enforcing
// .mcp/servers.json
{
  "repo":    { "scope": "read/write", "paths": ["src/**"] },
  "docs":    { "scope": "read",       "paths": ["docs/**"] },
  "ci":      { "scope": "read",       "tool": "github-actions" },
  "tracker": { "scope": "read/write", "tool": "linear" }
}

Read the full PROMPT_HISTORY.md →

Source

View the full source on GitHub →