
Case Study

Claude Code Toolkit: Production Infrastructure for AI Agents

How I built a production infrastructure layer for Claude Code with 13 skills, 7 agents, and a Smart MCP Loader.

Published | Jan 2026 - Present | Solo Developer
Open Source | 13 Skills | 7 Agents | Smart MCP Loader

The Problem

Claude Code ships as a blank canvas. Most setups rely on a single flat CLAUDE.md file with instructions pasted in. There is no agent coordination, no MCP server management, no memory persistence across sessions, and no scaffolding. Every new project starts from zero. Every session forgets what the last one learned.

The real friction is not writing code. It is the infrastructure around it: spinning up the right MCP servers, configuring agent permissions, maintaining context across sessions, bootstrapping new projects with consistent quality. These are the repetitive, structural tasks that eat hours before any real development begins.

What if Claude Code had a production infrastructure layer? Not a collection of prompts, but real tooling: lifecycle hooks, agent orchestration, intelligent context loading, and a scaffolding system that generates 48 files from a single interactive session.

Five-Layer Architecture

The toolkit is organized into five layers beneath the user session. Each layer serves a distinct purpose and operates independently:

User Session

Skills — 13 interactive slash commands

Rules — 12 context-aware .mdc files

Hooks — Lifecycle scripts (prompt, stop, compact)

Agent Fleet — 7 specialized agents

Reports — Agent output for orchestrator review

The key architectural decision is the Director Orchestrator pattern. Every agent except the Orchestrator is read-only: these agents analyze code and write reports to .claude/reports/, but they never modify source files. Only the Orchestrator agent commits changes. This eliminates merge conflicts when running multiple agents in parallel, a common problem in multi-agent setups where agents freely write to the same files.
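The read-only constraint can be sketched in a few lines. This is a minimal in-memory illustration, not the toolkit's code: the `Report` shape, the `ReadOnlyAgent` type, the `todoFinder` agent, and the report file name are all assumptions. Only the .claude/reports/ directory and the sole-committer rule come from the description above.

```typescript
// Hypothetical types for illustration; only .claude/reports/ comes from the text.
interface Report {
  agent: string;
  path: string; // where the report would be written
  findings: string[];
}

type SourceFiles = Readonly<Record<string, string>>;

// A read-only agent receives a frozen snapshot and can only return a report.
type ReadOnlyAgent = (files: SourceFiles) => Report;

// Example agent (assumed): flags files that still contain a TODO marker.
const todoFinder: ReadOnlyAgent = (files) => ({
  agent: "todo-finder",
  path: ".claude/reports/todo-finder.md", // naming scheme is an assumption
  findings: Object.keys(files)
    .filter((file) => (files[file] ?? "").includes("TODO"))
    .map((file) => `${file}: contains TODO`),
});

class Orchestrator {
  private files: Record<string, string>;

  constructor(files: Record<string, string>) {
    this.files = { ...files };
  }

  // Agents can run in any order: none of them has a write path.
  collectReports(agents: ReadOnlyAgent[]): Report[] {
    const snapshot: SourceFiles = Object.freeze({ ...this.files });
    return agents.map((agent) => agent(snapshot));
  }

  // The single write path: only the orchestrator changes source files.
  commit(file: string, content: string): void {
    this.files[file] = content;
  }

  read(file: string): string | undefined {
    return this.files[file];
  }
}

const orchestrator = new Orchestrator({ "src/app.ts": "// TODO: handle errors" });
const reports = orchestrator.collectReports([todoFinder]);
console.log(reports.length); // 1 report, with one finding for src/app.ts
orchestrator.commit("src/app.ts", "// errors handled");
```

Because agents only ever return reports, two agents analyzing the same file cannot conflict; any conflict resolution is concentrated in the orchestrator's `commit` step.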

Smart MCP Loader

MCP servers are resource-expensive to keep running and tedious to manually enable and disable. The Smart MCP Loader solves this with a real-time NLP scoring engine that intercepts every user prompt via the UserPromptSubmit lifecycle hook.

When you type a message, the hook normalizes the text (lowercase, expand contractions, collapse whitespace), then scores it against 11 on-demand MCP server definitions. Phrase matches score +6 points, keyword matches score +2, and hard-suppress rules (noneOf) prevent false positives. For example, "sourcemap", "sitemap", and "hashmap" all suppress the Mapbox server so it does not activate on unrelated prompts.
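The scoring rules above can be sketched as follows. This is an illustration under stated assumptions rather than the loader's actual code: the `ServerDefinition` shape (apart from the `noneOf` name), the contraction table, the sample Mapbox phrases and keywords, and the use of substring matching are all hypothetical. The source does not state an activation threshold, so none is modeled here.

```typescript
// Hypothetical definition shape; only `noneOf` is named in the text.
interface ServerDefinition {
  name: string;
  phrases: string[];  // each match adds 6 points
  keywords: string[]; // each match adds 2 points
  noneOf: string[];   // any match suppresses the server entirely
}

const PHRASE_POINTS = 6;
const KEYWORD_POINTS = 2;

// Small illustrative subset; a real contraction table would be longer.
const CONTRACTIONS: Array<[string, string]> = [
  ["can't", "cannot"],
  ["don't", "do not"],
  ["what's", "what is"],
];

// Lowercase, expand contractions, collapse whitespace.
function normalize(prompt: string): string {
  let text = prompt.toLowerCase().replace(/\u2019/g, "'");
  for (const [short, long] of CONTRACTIONS) {
    text = text.split(short).join(long);
  }
  return text.replace(/\s+/g, " ").trim();
}

function scorePrompt(prompt: string, server: ServerDefinition): number {
  const text = normalize(prompt);
  // Hard suppress: one noneOf hit zeroes the score regardless of other matches.
  if (server.noneOf.some((term) => text.includes(term))) return 0;
  let score = 0;
  for (const phrase of server.phrases) {
    if (text.includes(phrase)) score += PHRASE_POINTS;
  }
  for (const keyword of server.keywords) {
    if (text.includes(keyword)) score += KEYWORD_POINTS;
  }
  return score;
}

// Sample definition: the noneOf terms come from the text, the rest is assumed.
const mapbox: ServerDefinition = {
  name: "mapbox",
  phrases: ["interactive map", "map layer"],
  keywords: ["mapbox", "geocoding"],
  noneOf: ["sourcemap", "sitemap", "hashmap"],
};

console.log(scorePrompt("Add an interactive map with Mapbox", mapbox)); // 8
console.log(scorePrompt("Why is my sourcemap broken?", mapbox));        // 0
```

Checking `noneOf` before scoring keeps the suppress rule absolute: a prompt that mentions both "sourcemap" and "mapbox" still scores zero in this sketch.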

When your session ends, a separate session-cleanup hook disables all servers that were enabled during the session. The entire lifecycle is automatic: the right server activates when you need it and cleans up when you are done.

11 On-Demand MCP Servers

The Agent Fleet

Seven specialized agents span three Claude model tiers. The Opus orchestrator coordinates everything. Sonnet agents handle deep analysis: code review, security audits, and test generation. Haiku agents run fast, lightweight tasks: health monitoring, dead code detection, and codebase search.

Orchestrator (Opus): Master coordinator, sole committer

Code Reviewer (Sonnet): TypeScript strict mode, React patterns, performance

Security Reviewer (Sonnet): OWASP audit, RLS verification, secrets scanning

Test Generator (Sonnet): Test cases, coverage gaps, Ollama bulk generation

Health Monitor (Haiku): Vercel, Supabase, Expo deployment health

Code Simplifier (Haiku): Dead code, duplicates, complexity hotspots

Researcher (Haiku): Fast codebase search and exploration

7 Specialized Agents

13 Interactive Skills

Every skill is a slash command that activates directly in Claude Code. They cover the full development lifecycle: from bootstrapping a new project with 48 template files, to evaluating skill quality across 8 weighted dimensions, to running tiered research that escalates from web search through Perplexity deep research.

/new-project: 6-phase interactive bootstrapper with 48 templates
/skill-eval: 8-dimension quality scoring with anti-pattern detection
/track: Track-based development with TDD checkpoints
/research: 5-tier search escalation from WebSearch to Perplexity Research
/debrief: Collaborative session analysis with memory persistence
/remote-session: Mobile-safe mode with plan-first workflow

Skills use progressive disclosure to manage context window budget. Heavy skills split into core instructions (always loaded) and resource sections (loaded on demand), saving hundreds of lines of context for normal coding sessions.

13 Interactive Skills

Memory and Intelligence

Knowledge persists across sessions through a three-tier memory cascade. ChromaDB provides semantic search using Ollama embeddings (nomic-embed-text). A MEMORY.md file serves as an instant-access cache with a 200-line limit and auto-archiving. Agent reports in .claude/reports/ provide the third tier.
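The 200-line limit with auto-archiving could look roughly like this. It is a sketch only: the `CacheState` shape and the oldest-first eviction order are assumptions, since the text does not say which lines are archived when the cache overflows, and the real toolkit works on a MEMORY.md file rather than in-memory arrays.

```typescript
const MAX_LINES = 200; // limit stated in the text

// Hypothetical in-memory stand-in for MEMORY.md and its archive.
interface CacheState {
  active: string[];   // lines currently in the cache
  archived: string[]; // lines moved out of the cache
}

// Appends a note; if the cache overflows, the oldest lines move to the archive.
// Oldest-first eviction is an assumption made for this sketch.
function appendNote(state: CacheState, note: string): CacheState {
  const lines = [...state.active, note];
  const overflow = Math.max(0, lines.length - MAX_LINES);
  return {
    active: lines.slice(overflow),
    archived: [...state.archived, ...lines.slice(0, overflow)],
  };
}
```

Returning a new state instead of mutating the old one keeps the sketch easy to test; a file-backed version would write both the cache and the archive after each append.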

The system degrades gracefully. If ChromaDB is unavailable, the MEMORY.md cache still works. If both are down, agent reports persist in the filesystem. The /debrief skill runs a collaborative session analysis where Claude proposes memory notes and the user reviews them via multi-select before saving.
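The fallback order can be expressed as a loop over tiers that moves on whenever a tier is unavailable. This is a simplified, synchronous sketch: the `MemoryTier` interface and the `recall` function are hypothetical names, and real ChromaDB or filesystem lookups would be asynchronous.

```typescript
// Hypothetical tier interface; a tier signals unavailability by throwing.
interface MemoryTier {
  name: string;
  search: (query: string) => string[];
}

interface RecallResult {
  tier: string;      // which tier answered
  results: string[];
}

// Tries each tier in order and returns the first one that responds.
function recall(query: string, tiers: MemoryTier[]): RecallResult | null {
  for (const tier of tiers) {
    try {
      return { tier: tier.name, results: tier.search(query) };
    } catch {
      // Tier is down: fall through to the next one.
    }
  }
  return null; // every tier was unavailable
}

// Usage with stand-in tiers: ChromaDB is simulated as unreachable.
const tiers: MemoryTier[] = [
  { name: "ChromaDB", search: () => { throw new Error("connection refused"); } },
  { name: "MEMORY.md", search: (q) => [`cached note about ${q}`] },
  { name: "agent reports", search: () => [] },
];

console.log(recall("auth flow", tiers)?.tier); // MEMORY.md
```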

Duplicate detection prevents memory bloat: before inserting into ChromaDB, the system checks embedding distance and skips entries that are too similar (distance < 0.3) to existing memories.
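A minimal sketch of that check, assuming cosine distance over raw embedding vectors. The text gives the 0.3 threshold but not the metric, so cosine is an assumption; in practice the distance would come from a ChromaDB query and depend on how the collection is configured. The function names here are hypothetical.

```typescript
const DUPLICATE_DISTANCE = 0.3; // threshold stated in the text

// Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// True when any stored embedding is closer than the threshold.
function isDuplicate(candidate: number[], stored: number[][]): boolean {
  return stored.some((existing) => cosineDistance(candidate, existing) < DUPLICATE_DISTANCE);
}

// Inserts the embedding only if it is not a near-duplicate; returns whether it was added.
function insertIfNovel(store: number[][], candidate: number[]): boolean {
  if (isDuplicate(candidate, store)) return false;
  store.push(candidate);
  return true;
}
```

With a store containing `[1, 0]`, a candidate of `[0.99, 0.1]` is skipped (distance of about 0.005), while `[0, 1]` is inserted (distance 1).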

Open Source

The entire toolkit is MIT licensed and publicly available. It was built entirely with Claude Code (Opus 4.6), making it a meta-recursive demonstration: a Claude Code toolkit built by Claude Code, showcasing the very patterns it teaches.

Installation uses a symlink-based approach. Running ./scripts/install.sh creates symlinks from the cloned repository into ~/.claude/, so pulling updates from GitHub propagates changes automatically without reinstallation.

Tech Stack

Backend: Node.js (ESM), Supabase

AI: Claude Code CLI, MCP Protocol, ChromaDB, Ollama

Infrastructure: Docker, Shell Scripting, Vercel

Frontend: TypeScript

Tooling: GitHub Actions, n8n

Claude Code Toolkit: Production Infrastructure for AI Agents — Jeff Michael Johnson