Crucible is an adversarial document-hardening desk for the documents you can't afford to get wrong: incident runbooks, security policies, compliance procedures, contracts. Most AI tools stop at "Find": they list problems and hand them back. Crucible closes the loop with Find → Fix → Prove.

Three AI agents work against each other, coordinated through Band. A Red agent attacks the document like an adversary and surfaces concrete flaws. A Blue agent defends, rewriting the document to close each gap. An Arbiter then judges the new version and scores it from 0 to 100. The loop repeats for up to three rounds, and on high-stakes documents it escalates to a human for final sign-off regardless of score. Every action (each attack, fix, score, and approval) is appended to a tamper-evident SHA-256 hash chain, so the whole hardening history can be independently verified afterward.

In our live run on a Kubernetes production-rollback runbook, the agents surfaced 14 findings (3 Critical, 5 High, 5 Medium, 1 Low) across three rounds, with the Arbiter scoring 75 → 75 → 85. The run used all three rounds. Because some findings were still partially open and high-stakes work requires a human gate, it escalated to a human on-call engineer, who reviewed and approved the final version. The result was sealed into a 67-entry hash chain, fully reproducible at /api/verify.

Crucible runs all three agents on Featherless (Qwen2.5-7B-Instruct) and coordinates them on Band, with a Next.js front end on Vercel and Supabase for state and realtime. It's built for regulated, high-stakes workflows where "looks fine" isn't enough: you need proof that a document was stress-tested before it ever met reality.
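
To make the tamper-evident log concrete, here is a minimal sketch of a SHA-256 hash chain in Python. It is an illustration of the general technique, not Crucible's actual schema or the logic behind /api/verify: the field names (`prev_hash`, `payload`, `hash`), the all-zero genesis value, the canonical-JSON encoding, and the sample entries are all assumptions made for the example.

```python
import hashlib
import json

# Assumed placeholder used as the "previous hash" of the first entry.
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> None:
    """Append a new entry that commits to everything recorded before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    })


def verify_chain(chain: list) -> bool:
    """Recompute every hash from the start; any edited entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical entries for one round: an attack, a fix, and a score.
chain = []
append_entry(chain, {"actor": "red", "action": "attack", "round": 1})
append_entry(chain, {"actor": "blue", "action": "fix", "round": 1})
append_entry(chain, {"actor": "arbiter", "action": "score", "round": 1, "score": 75})
print(verify_chain(chain))  # True

# Editing any recorded payload after the fact makes verification fail.
chain[1]["payload"]["action"] = "noop"
print(verify_chain(chain))  # False
```

Because each entry's hash covers the previous entry's hash, changing one record invalidates it and every record after it. That is why a verifier only needs the chain itself to check the history independently.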