05 Writing
Thoughts on AI systems, engineering leadership, and building production software.
Demystifying Claude Code: Beyond the Chat Interface
CLAUDE.md, skills, MCP servers, subagents, hooks, and agent teams — how the pieces actually work, when to use each one, and the patterns that emerge when they compose.
What I Learned Scaling a Team from 20 to 130 Engineers
From negative $50K a month to 130 engineers across 40 countries during a pandemic. The strategy that worked, the breaking points, and what happened after I left.
Building a Real-Time Voice Pipeline From Scratch
From a two-process prototype to production WebSocket voice. VAD tuning, Whisper padding tricks, and local-first inference with smart-whisper and Kokoro-82M.
GPU Inference Without the Cloud Tax
SPOT L4 instances at $0.28/hr vs managed services at $0.80/hr. Terraform automation, zero-trust networking, and container patterns for self-managed GPU inference.
What No One Tells You About the IC-to-CTO Pipeline
The standard IC-to-CTO advice is five recycled points stripped of context. Two stints, one scaling a failing org and one navigating a political minefield, taught me what the playbook leaves out.
Building Secure MCP Servers for Production
A practical guide to the architecture decisions, implementation patterns, and operational practices that separate production-grade MCP servers from the kind that end up in breach timelines.