Production incidents are expensive twice: once in downtime, and once in the hours engineers spend chasing the wrong lead before finding the real one. PostMortem.ai eliminates the second cost. Feed it an incident ID. Six specialist agents run in sequence, each on a model sized for its job. Every reasoning step streams live to a terminal dashboard. You get a complete post-mortem with zero human intervention.

VisionAgent on Gemini 2.5 Flash reads dashboard screenshots before text reasoning begins. HypothesisAgent on Llama-3.1-70b (Vultr Serverless Inference) generates investigation leads. EvidenceAgent on Llama-3.1-8b evaluates each tool result in a tight loop. RootCauseAgent on Llama-3.1-70b synthesizes confirmed evidence. CriticAgent on Gemini (gemini-2.0-flash, Google) red-teams the conclusion from a completely independent provider, falling back to Qwen-32b if Gemini quotas are exhausted. ReportAgent on Llama-3.1-70b writes the final document.

The CriticAgent uses a different model family and a different provider specifically so that the conclusion is never validated by the model that produced it. Llama reasons on Vultr. Gemini challenges it from Google. This makes correlated reasoning errors far less likely.

The investigation is rejection-first by design. Hypothesis 1 is always a red herring. The agent must reject it with evidence before proceeding, which forces visible reasoning instead of shortcuts to the answer.

Every tool call hits a real HTTP endpoint and is observable in the server access logs. Five incident scenarios are included, ranging in cost from $21K to $312K. A PagerDuty webhook triggers fully autonomous investigations. The system is deployed on Vultr with Docker Compose and nginx tuned for server-sent events (SSE).
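
The agent roster, the critic fallback, and the rejection-first rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PostMortem.ai's actual code: `AgentSpec`, `QuotaExhausted`, `run_critic`, `investigate`, and the `call_model` / `evaluate` callbacks are hypothetical names, the provider strings are plain labels, and the provider of the Qwen-32b fallback is not given in the description, so it is marked "unspecified".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    model: str
    provider: str  # label only; no real client is attached in this sketch


class QuotaExhausted(Exception):
    """Hypothetical error a caller raises when a provider's quota is used up."""


# Order follows the description: vision first, report last.
PIPELINE = [
    AgentSpec("VisionAgent", "gemini-2.5-flash", "google"),
    AgentSpec("HypothesisAgent", "llama-3.1-70b", "vultr"),
    AgentSpec("EvidenceAgent", "llama-3.1-8b", "vultr"),
    AgentSpec("RootCauseAgent", "llama-3.1-70b", "vultr"),
    AgentSpec("CriticAgent", "gemini-2.0-flash", "google"),
    AgentSpec("ReportAgent", "llama-3.1-70b", "vultr"),
]
# Assumption: the fallback's provider is not stated in the description.
CRITIC_FALLBACK = AgentSpec("CriticAgent", "qwen-32b", "unspecified")


def family(model: str) -> str:
    """Crude model-family label: the text before the first hyphen."""
    return model.split("-")[0]


def critic_is_independent(pipeline, critic, reasoner_name="RootCauseAgent"):
    """True if the critic differs from the reasoner in both provider and family."""
    reasoner = next(a for a in pipeline if a.name == reasoner_name)
    return (critic.provider != reasoner.provider
            and family(critic.model) != family(reasoner.model))


def run_critic(conclusion, call_model):
    """Try the primary critic; on quota exhaustion, retry once on the fallback.

    call_model(spec, text) is a caller-supplied function that performs the
    actual inference request. Returns (model_used, critique).
    """
    primary = next(a for a in PIPELINE if a.name == "CriticAgent")
    try:
        return primary.model, call_model(primary, conclusion)
    except QuotaExhausted:
        return CRITIC_FALLBACK.model, call_model(CRITIC_FALLBACK, conclusion)


def investigate(hypotheses, evaluate):
    """Rejection-first walk over hypotheses, in order.

    evaluate(h) returns (verdict, evidence) with verdict "rejected" or
    "confirmed". A confirmation is only accepted after at least one earlier
    hypothesis has been rejected with non-empty evidence.
    Returns (confirmed_hypothesis_or_None, rejected_log).
    """
    rejected = []
    for h in hypotheses:
        verdict, evidence = evaluate(h)
        if verdict == "rejected":
            if not evidence:
                raise ValueError(f"rejection of {h!r} needs evidence")
            rejected.append((h, evidence))
        elif not rejected:
            raise RuntimeError("nothing rejected yet; cannot confirm first")
        else:
            return h, rejected
    return None, rejected
```

Keeping the roster as data makes the independence property checkable: `critic_is_independent(PIPELINE, PIPELINE[4])` compares the critic against RootCauseAgent on both provider and model family. The rejected log returned by `investigate` is the kind of record that could back the visible reasoning trail in the final report.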