AgentShield Battle is a real-time multi-agent cybersecurity simulation in which six autonomous AI agents simultaneously attack and defend a deliberately vulnerable banking API, with no human intervention required.

The problem: Financial APIs face constant automated attacks. Traditional security teams are reactive and slow, and they cannot match the speed of modern attack toolkits.

The solution: A coordinated swarm of AI agents modeled on a real Security Operations Center (SOC). Three red team agents (Red Attacker, White Box, and Chaos QA) exploit real OWASP vulnerability classes (broken object level authorization, or BOLA; JWT bypass; race conditions) against a live banking API. Three blue team agents (Blue Defender, Purple Arbiter, and State Manager) detect, reason about, and respond to each attack in milliseconds using LLM-powered decision making.

What makes it different: Every defense decision is made by an LLM (Qwen2.5-72B via Featherless AI) reasoning about the attack in real time, not by hardcoded rules. Purple Arbiter uses Claude Sonnet with Extended Thinking to narrate the battle, run a Planning Poker severity vote across all six agents, and generate security patches. All agents coordinate through a pub/sub message bus that mirrors the Band AI room model, and they post live updates to a Band AI room visible at app.band.ai/dashboard.

The result: A live SOC Dashboard showing $1M in capital under attack, a resilience score that updates in real time, an attack heatmap, and five hardening flags that activate autonomously. The entire cycle from attack to defense completes in under 2 minutes.

Stack: Claude Sonnet 4.6 · Qwen2.5-72B · Band AI · FastAPI · React · Google Cloud Run · Firebase Hosting
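
The pub/sub coordination and Planning Poker severity vote described above could be sketched roughly as follows. This is a minimal in-process illustration, not the project's actual code: the topic names, the card deck, the per-agent bias values, the median-based consensus rule, and every class and function name (MessageBus, VotingAgent, VoteTally, run_vote) are assumptions made for the example. In the real system each vote would presumably come from an LLM call rather than a fixed multiplier.

```python
from collections import defaultdict
from statistics import median
from typing import Callable, Dict, List, Tuple

# Hypothetical Planning Poker deck; the source does not specify the scale.
POKER_DECK = [1, 2, 3, 5, 8, 13]


class MessageBus:
    """Tiny synchronous pub/sub bus: topic name -> subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver to a copy of the list so handlers may subscribe while we iterate.
        for handler in list(self._subscribers[topic]):
            handler(message)


def snap_to_deck(value: float) -> int:
    """Return the deck card closest to a raw severity score."""
    return min(POKER_DECK, key=lambda card: abs(card - value))


class VotingAgent:
    """Stand-in for one agent; `bias` replaces the LLM's judgment (assumption)."""

    def __init__(self, name: str, bias: float, bus: MessageBus) -> None:
        self.name = name
        self.bias = bias
        self.bus = bus
        bus.subscribe("attack.detected", self.on_attack)

    def on_attack(self, event: dict) -> None:
        card = snap_to_deck(event["base_severity"] * self.bias)
        self.bus.publish("severity.vote", {"agent": self.name, "card": card})


class VoteTally:
    """Collects one card per agent and reports the median as the consensus."""

    def __init__(self, bus: MessageBus) -> None:
        self.votes: Dict[str, int] = {}
        bus.subscribe("severity.vote", self.on_vote)

    def on_vote(self, vote: dict) -> None:
        self.votes[vote["agent"]] = vote["card"]

    def consensus(self) -> int:
        return snap_to_deck(median(self.votes.values()))


def run_vote(attack: str, base_severity: int) -> Tuple[Dict[str, int], int]:
    """Wire up six agents, announce one attack, and return (votes, consensus)."""
    bus = MessageBus()
    tally = VoteTally(bus)
    # Agent names come from the description; the bias values are invented.
    biases = {
        "Red Attacker": 1.5, "White Box": 1.0, "Chaos QA": 0.75,
        "Blue Defender": 1.0, "Purple Arbiter": 1.25, "State Manager": 0.25,
    }
    agents = [VotingAgent(name, bias, bus) for name, bias in biases.items()]
    bus.publish("attack.detected", {"attack": attack, "base_severity": base_severity})
    return tally.votes, tally.consensus()


if __name__ == "__main__":
    votes, verdict = run_vote("BOLA", 8)
    print(votes)
    print("consensus severity:", verdict)
```

The bus here is synchronous and lives in a single process, which keeps the sketch easy to follow. A deployment with agents running as separate services would need a networked broker or the Band AI room itself in that role.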