
The Problem: $743B in e-commerce returns annually, and most low-value damage claims still take 3 days of human back-and-forth. Existing AI agents (Sierra, Decagon, etc.) automate the customer dialogue. ClaimsForge is the Trust Layer that compliance and merchant ops teams have been waiting for.

How It Works: Seven Gemini 2.5 specialist agents run as an asyncio pipeline whose middle stage fans out in parallel: IntentAgent → (Emotion ‖ Needs ‖ DamageAgent) → CompensationAgent → SupervisorAgent → VerifierAgent. The customer uploads a photo of a damaged item; DamageAgent (Gemini 2.5 Vision) returns a bounding-box-localized severity plus a detected_subject. CompensationAgent picks from 26 policies and a 1,329-entry hybrid-RAG knowledge base (Gemini embeddings plus keyword search). Then an AWS-IAM-style Supervisor enforces deterministic hard rules in pure Python: pHash visual-replay gate, EXIF age check, multimodal text/image consistency, duplicate-order detection, $500 cash cap, perishable exemption. Every reply ships with a Stripe-Radar-style Trust Score built from 6 weighted factors (image uniqueness, image provenance, amount sandbox, history coherence, emotion gating, evidence quality), each with a rule_id backlink to the supervisor decision that drove it.

What Makes It Different:
• 100% Auditable Decisions: every rule is pure Python and every match is logged with its rule_id, so no decision is opaque
• Deepfake-Aware: pHash collision detection + EXIF DateTimeOriginal sanity check + multimodal text/image consistency gate (99% of insurers have seen AI-tampered evidence; only 32% feel confident catching it)
• Quick Self-Deploy: MIT-licensed, single Vultr VM, no managed-service contract (a typical Sierra deployment is $200K-2M ARR)
• Self-Evolving: 47 methodologies auto-distilled by Gemini from resolved cases
• Tier-2 Hot-Editable Rules: the ops team toggles "Liquid on Electronics escalates" via the admin UI, with no code deploy
• Data Flywheel: every approved claim is auto-labeled gold/normal/red_flag, with one-click SFT export to Vertex AI / OpenAI / Anthropic fine-tuning formats
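To make the Supervisor and Trust Score idea concrete, here is a minimal sketch of how deterministic hard rules with rule_id backlinks could look in pure Python. It is an illustration under stated assumptions, not ClaimsForge's actual code: the rule_id strings, the factor weights, the pHash distance threshold, and the `Claim`, `Decision`, `supervise`, and `trust_score` names are all hypothetical, and only three of the six factors are shown.

```python
from dataclasses import dataclass, field

CASH_CAP_USD = 500.0      # the $500 cash cap named in the text
MAX_PHASH_DISTANCE = 4    # assumed threshold for "visually the same image"

# Hypothetical mapping: rule_id -> (Trust Score factor, weight).
# The weights are invented for illustration and sum to 100.
FACTOR_FOR_RULE = {
    "R-PHASH-REPLAY": ("image_uniqueness", 40),
    "R-DUP-ORDER": ("history_coherence", 30),
    "R-CASH-CAP": ("amount_sandbox", 30),
}


@dataclass
class Claim:
    order_id: str
    amount_usd: float
    image_phash: int  # 64-bit perceptual hash, assumed to be computed upstream


@dataclass
class Decision:
    matched: list = field(default_factory=list)  # rule_ids that fired, in order

    @property
    def approved(self) -> bool:
        return not self.matched


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def supervise(claim: Claim, seen_orders: set, seen_hashes: list) -> Decision:
    """Run every hard rule and record the rule_id of each one that fires."""
    decision = Decision()
    if any(hamming(claim.image_phash, h) <= MAX_PHASH_DISTANCE for h in seen_hashes):
        decision.matched.append("R-PHASH-REPLAY")
    if claim.order_id in seen_orders:
        decision.matched.append("R-DUP-ORDER")
    if claim.amount_usd > CASH_CAP_USD:
        decision.matched.append("R-CASH-CAP")
    return decision


def trust_score(decision: Decision) -> tuple:
    """Sum the weights of factors whose rule did not fire.

    Returns (score, backlinks), where backlinks maps each factor to the
    rule_id that penalized it, or None if the factor passed.
    """
    score = 0
    backlinks = {}
    for rule_id, (factor, weight) in FACTOR_FOR_RULE.items():
        fired = rule_id in decision.matched
        backlinks[factor] = rule_id if fired else None
        if not fired:
            score += weight
    return score, backlinks


claim = Claim(order_id="A-1001", amount_usd=620.0, image_phash=0xF0F0F0F0F0F0F0F0)
decision = supervise(claim, seen_orders={"A-0999"}, seen_hashes=[0x0F0F0F0F0F0F0F0F])
score, backlinks = trust_score(decision)
print(decision.approved, decision.matched, score, backlinks)
```

In this sketch the example claim trips only the cash-cap rule, so the amount_sandbox factor carries the backlink while the other two factors pass. Because both functions are plain Python with no model call, the same inputs always produce the same decision and the same audit trail.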
19 May 2026