Article Source Material: Stress-Testing the AI-First Org Design Kit on Every Inc

The Experiment

One Claude Code session, acting as Every’s CEO Dan Shipper, ran the complete 14-skill AI-First Org Design Kit end-to-end — from coordination audit to evolution auditor — producing a full organizational genome, governance framework, quality gates, workflow specifications, role definitions, agent configurations, adoption infrastructure, and evolution mechanisms.

No human intervention between phases. The agent answered all interview questions itself, drawing on research from 63 Every newsletters and articles.

By the Numbers

| Metric | Value |
| --- | --- |
| Total files produced | 75 |
| Total lines written | 11,907 |
| Articles researched | 63 |
| Research lines extracted | 2,397 |
| Subagents spawned | 19 |
| Skills invoked | 14 (full kit) |
| Genome files | 7 |
| Governance documents | 7 |
| Quality gates | 4 (with 23 holdout scenarios) |
| Workflow specifications | 4 |
| Roles mapped | 14 |
| Agent configurations | 5 |
| Claude Code governance skills | 5 |
| Hard boundaries defined | 9 |
| Anti-patterns documented | 10 |
| Team members assessed (maturity) | 19 |
| AGENT-PRIMER compression ratio | 33:1 |
| Errors found in audit | 3 critical, 6 important, 6 minor |
| Errors fixed | 3 critical (all fixed) |

What the Kit Produced

Layer 1: Identity (The Genome)

Who Every is, encoded as decision rules. Five values — with Builder Credibility as the absolute tiebreaker that is never compromised. Voice norms that say “use ‘taste’ and ‘ship’” and “never say ‘leverage’ or ‘synergy.’” Quality standards per output type. Ten anti-patterns that define what “not us” looks like.
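Voice norms like these are simple to mechanize. A minimal sketch of how a banned-word rule could be enforced as a lint pass over a draft; the word lists mirror only the two examples quoted above, and the function name is hypothetical, not the kit's actual schema:

```python
import re

# Illustrative only: the genome's real voice norms presumably carry
# fuller banned/preferred lists than the two examples quoted here.
BANNED = {"leverage", "synergy"}

def voice_violations(text: str) -> list[str]:
    """Return banned words found in a draft, case-insensitively."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & BANNED)

draft = "We leverage synergy to ship products with taste."
print(voice_violations(draft))  # ['leverage', 'synergy']
```

A check like this is the kind of decision rule an agent can run on its own output before a human ever sees it.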

Layer 2: Operations (Governance + Gates)

How agents operate within Every’s boundaries. Nine hard boundaries (never publish without human review, never share client data cross-engagement, never bypass quality gates). Four quality gates with hidden holdout scenarios for validation. Escalation protocols with 2-minute decision format. A policy generation mechanism so governance grows from evidence, not committee meetings.
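Two of the hard boundaries named above can be sketched as a pre-action screen. The `Action` shape and field names are assumptions for illustration; only the rule wording comes from the source:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str                        # e.g. "publish", "share_data" (illustrative)
    human_reviewed: bool = False
    crosses_engagements: bool = False

def boundary_violations(action: Action) -> list[str]:
    """Screen a proposed action against never-do rules before it runs."""
    violations = []
    if action.kind == "publish" and not action.human_reviewed:
        violations.append("never publish without human review")
    if action.kind == "share_data" and action.crosses_engagements:
        violations.append("never share client data cross-engagement")
    return violations

print(boundary_violations(Action(kind="publish")))
# ['never publish without human review']
```

The point of a hard boundary is that it gates the action itself rather than relying on the agent's judgment in the moment.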

Layer 3: Roles + Workflows (Specifications)

How work flows through Every. Four workflow specifications that pass the Stranger Test. Fourteen roles decomposed into specification/coordination/execution time allocation. The Two-Slice Team model formalized — each GM runs their product solo with 99% AI-written code.

Layer 4: Agents + Evolution (Deployment)

How to deploy and evolve. Five agent configurations (editorial, engineering, consulting, distribution, product GM) with system prompts, tool permissions, and self-review checklists. Maturity ladder assessing 19 team members. Two adoption sprint designs. A decision ledger and monthly evolution audit cycle.

Layer 5: Operationalization (The Bridge)

AGENT-PRIMER.md — 206 lines that distill ~7,000 lines of specification into actionable operating rules any agent can load (33:1 compression). CLAUDE.md with @imports so governance auto-loads at session start. Five invokable /org-* skills for governance operations in Claude Code.
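An illustrative shape for that CLAUDE.md, using Claude Code's `@path` import syntax. The governance file paths below are assumptions; only AGENT-PRIMER.md is named in the source:

```markdown
# CLAUDE.md — governance auto-loads at session start
@AGENT-PRIMER.md
@governance/hard-boundaries.md
@governance/escalation-protocol.md
```

Because imports resolve when the session starts, every agent begins with the operating rules already in context rather than having to be told to read them.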

Key Insight

The hardest part of AI-first org design is not the framework — it’s the excavation. Articulating implicit organizational knowledge (“what does ‘taste’ actually mean when you have to encode it for an agent?”) is a genuinely difficult intellectual exercise. The 63-article research phase was the investment that made everything downstream credible.