THE PROBLEM AI applications are exploding. Every LLM call is a security boundary. OWASP recently published the Top 10 for LLM apps (2025 v2) — prompt injection, insecure output handling, system prompt leakage, supply chain risks, excessive agency. But there's zero tooling that automatically converts security findings into production-ready policies. Today, security teams manually translate scan results into firewall configs — slow, error-prone, and ungoverned. THE SOLUTION FORGE is an end-to-end pipeline built with IBM Bob: 1. Scanner — detects LLM call-sites (OpenAI, Anthropic, LangChain, LlamaIndex) and maps each to OWASP categories LLM01-LLM10, including LLM07 System Prompt Leakage. Pattern detection for prompt injection (f-strings, format strings), credential leaks, missing output validation, agentic overreach (subprocess in agent loops), supply chain gaps, and hardcoded system prompts. 2. Policy Generator — converts findings into Lobster Trap-compatible YAML policies with ingress rules, egress allowlists, rate limits, and filesystem sandboxing. Each OWASP category gets specific countermeasures. 3. BobShell — tamper-evident SHA-256 hash-chained audit trail. Every action is cryptographically linked to the previous, making policy generation a compliance-grade artifact for SOC 2, ISO 27001, GDPR, and HIPAA reviews. BUILT WITH IBM BOB 3 productive Bob IDE tasks (mandatory per guide) + 5 Bob Shell sessions. ~27 of 40 Bobcoins used. Bob produced: LLM07 detection (5 patterns + 5 tests), 717-line ONBOARDING.md contributor guide, 595-line RISK_REGISTER.md security self-audit, 1024-line architecture doc, 95 unit tests, security hardening, 60KB of documentation. All Bob IDE markdown exports + consumption summary screenshots in bob_sessions/. VERIFIED 95/95 tests pass in 1.2s. Demo finds 15 vulnerabilities across 7 OWASP categories in 27ms. Benchmark: 95,000 lines/second scan throughput. Production-grade. MIT-licensed.
Category tags: