
AgentMarshal is a governance and audit-evidence layer for autonomous AI agents: not a screener, not a monitor. It sits in front of an agent fleet, evaluates every proposed action against a policy contract, and produces a cryptographic receipt for the decision.

Agents now decide at machine speed on things that touch money, risk, and compliance. When one goes wrong, the question is: show me, cryptographically, exactly what the agent knew and decided at the moment it decided. Logs can be edited, backfilled, or silently stop. Reconstruction isn't proof.

Every decision and every refusal becomes a signed audit record: Ed25519-signed, JCS-canonicalized (RFC 8785), RFC 3161 timestamped by an independent authority (FreeTSA), and hash-chained to prior records. A one-byte edit flips verification from valid to invalid. Anyone can verify a receipt against our published key and the timestamp authority's certificate, without trusting us.

Cedar and OPA do structural authorization well; AgentMarshal matches them and doesn't claim to beat them there. The difference is the data model: a signed, externally-anchored, tamper-evident audit artifact doesn't exist in their world. Bring your own policy engine; AgentMarshal layers the why, the when, and the evidence on top.

The demo is a financial-crime screening desk. Hit Run sequence: agents propose trades and AgentMarshal evaluates each in under a second. A trade with a sanctioned counterparty is DENIED with a signed, freshly-timestamped receipt. Screening a counterparty fires real Bright Data calls (a SERP search and three Crawl scrapes), each checked against policy before it runs and recorded with its response hash inside the receipt. The reasoning runs on the AI/ML API and is sealed into the record. Edit one byte of any receipt and verification turns red.

Logs aren't evidence. AgentMarshal gives you the receipt that proves what your agent knew when it decided.
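To make the receipt idea concrete, here is a minimal sketch of signing and hash-chaining decision records with Node's built-in `crypto` module. It is not AgentMarshal's actual code: the `Receipt` shape, `issueReceipt`, `verifyChain`, and the sample actions are hypothetical names chosen for illustration, `canonicalize` is a simplified stand-in for RFC 8785 (it only sorts object keys), and the RFC 3161 timestamp step is left out entirely.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical receipt shape; the real schema is not shown in the source.
interface Receipt {
  body: { action: string; decision: "ALLOW" | "DENY"; reason: string; prevHash: string };
  hash: string;      // SHA-256 of the canonical body, hex-encoded
  signature: string; // Ed25519 signature over the canonical body, base64-encoded
}

// Simplified stand-in for RFC 8785 (JCS): recursively sort object keys.
// Full JCS also fixes number and string serialization, which this skips.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// The first record in a chain points at an all-zero hash.
const GENESIS = "0".repeat(64);

function issueReceipt(
  prev: Receipt | null,
  action: string,
  decision: "ALLOW" | "DENY",
  reason: string,
  privateKey: KeyObject,
): Receipt {
  const body = { action, decision, reason, prevHash: prev ? prev.hash : GENESIS };
  const bytes = Buffer.from(canonicalize(body), "utf8");
  return {
    body,
    hash: createHash("sha256").update(bytes).digest("hex"),
    // Ed25519 takes no separate digest algorithm, hence `null`.
    signature: sign(null, bytes, privateKey).toString("base64"),
  };
}

// Recompute every hash, check every signature, and check every back-link.
function verifyChain(chain: Receipt[], publicKey: KeyObject): boolean {
  let expectedPrev = GENESIS;
  for (const r of chain) {
    const bytes = Buffer.from(canonicalize(r.body), "utf8");
    const hashOk = createHash("sha256").update(bytes).digest("hex") === r.hash;
    const sigOk = verify(null, bytes, publicKey, Buffer.from(r.signature, "base64"));
    if (!hashOk || !sigOk || r.body.prevHash !== expectedPrev) return false;
    expectedPrev = r.hash;
  }
  return true;
}

// Usage: two chained receipts, then a copy with one field edited after signing.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const first = issueReceipt(null, "trade:clear-counterparty", "ALLOW", "no sanctions hit", privateKey);
const second = issueReceipt(first, "trade:sanctioned-counterparty", "DENY", "sanctioned counterparty", privateKey);
const chain: Receipt[] = [first, second];
const tampered: Receipt[] = [first, { ...second, body: { ...second.body, decision: "ALLOW" } }];

console.log(verifyChain(chain, publicKey));    // true
console.log(verifyChain(tampered, publicKey)); // false
```

Because each record's `prevHash` is part of the signed body, changing any earlier record also invalidates every record after it. A production verifier would additionally validate the timestamp token against the timestamp authority's certificate.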
31 May 2026

AgentMarshal is a governance layer for autonomous AI agent fleets, built on Veea's Lobster Trap as a constitutive dependency.

THE PROBLEM
Businesses across industries are deploying autonomous AI agents with credentials, inboxes, and corporate cards. The capability landed before the governance layer did. One manipulated prompt can authorize a wire transfer, leak customer data, or commit a business to a contract the owner never approved. Real governance is four-dimensional: intent, vendor, category, cumulative spend.

THE ARCHITECTURE
Lobster Trap inspects every prompt with DPI, flagging injection patterns and obfuscation and computing a risk score. AgentMarshal consumes those signals and layers policy primitives on top: declared scope vs. detected intent, per-agent budgets, vendor allowlists, margin floors, approval thresholds. Every decision writes a full audit row.

THE DEMO
Cortez Roofing, a 5-agent fleet in Phoenix.
🟢 GREEN: routine invoice approved.
🟡 YELLOW: $14,800 quote at 28% margin against 35% floor. Escalates to Mike. One-click approval.
🔴 RED: Comms Agent receives spoofed invoice from [email protected] with embedded prompt injection demanding new ACH routing. Lobster Trap fires risk_score 0.83 + injection + obfuscation. AgentMarshal blocks via block_prompt_injection. $12,000 attack blocked.
Roofing is the example. The product is horizontal.

DEFENSE-IN-DEPTH
Lobster Trap is the inspection floor. AgentMarshal is the policy ceiling. Two layers, two jobs. One catches the conversation. The other catches the consequence.

Tech: Next.js 14, TypeScript, YAML-driven policy engine (38/38 tests), Veea Lobster Trap (Go sidecar, MIT, unmodified), Ollama (local) / Groq llama-3.1-8b-instant (prod), SQLite, Fly.io. MIT licensed.
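The layering described above (inspection signals first, policy primitives second) can be sketched as a single evaluation function. This is an illustration under stated assumptions, not the project's actual engine: the type names, the `evaluate` function, the `0.7` risk threshold, the budget figures, and every rule name except `block_prompt_injection` are hypothetical, and the real policy is described as YAML-driven rather than hard-coded.

```typescript
type Verdict = "ALLOW" | "ESCALATE" | "BLOCK";

// Signals assumed to come from the inspection layer (field names are hypothetical).
interface InspectionSignals { riskScore: number; injection: boolean; obfuscation: boolean; }

interface ProposedAction { agentId: string; vendor: string; amountUsd: number; marginPct?: number; }

interface Policy {
  riskThreshold: number;             // block at or above this risk score
  vendorAllowlist: string[];
  marginFloorPct: number;            // escalate quotes below this margin
  budgetUsd: Record<string, number>; // per-agent cumulative spend cap
}

interface Decision { verdict: Verdict; rule: string; }

function evaluate(
  action: ProposedAction,
  signals: InspectionSignals,
  policy: Policy,
  spentUsd: Record<string, number>,
): Decision {
  // 1. Inspection signals win: a flagged prompt never reaches the business rules.
  if (signals.injection || signals.riskScore >= policy.riskThreshold) {
    return { verdict: "BLOCK", rule: "block_prompt_injection" };
  }
  // 2. Vendor allowlist (blocking here is an assumption of this sketch).
  if (!policy.vendorAllowlist.includes(action.vendor)) {
    return { verdict: "BLOCK", rule: "vendor_not_allowlisted" };
  }
  // 3. Cumulative spend against the agent's budget.
  const projected = (spentUsd[action.agentId] ?? 0) + action.amountUsd;
  if (projected > (policy.budgetUsd[action.agentId] ?? 0)) {
    return { verdict: "ESCALATE", rule: "budget_exceeded" };
  }
  // 4. Margin floor: a human approves anything below it.
  if (action.marginPct !== undefined && action.marginPct < policy.marginFloorPct) {
    return { verdict: "ESCALATE", rule: "margin_below_floor" };
  }
  return { verdict: "ALLOW", rule: "default_allow" };
}

// Usage, loosely mirroring the three demo scenarios (all values illustrative).
const policy: Policy = {
  riskThreshold: 0.7,
  vendorAllowlist: ["supplier-a", "customer-quote"],
  marginFloorPct: 35,
  budgetUsd: { billing: 5000, sales: 50000, comms: 20000 },
};
const clean: InspectionSignals = { riskScore: 0.05, injection: false, obfuscation: false };
const flagged: InspectionSignals = { riskScore: 0.83, injection: true, obfuscation: true };

const green = evaluate({ agentId: "billing", vendor: "supplier-a", amountUsd: 450 }, clean, policy, {});
const yellow = evaluate({ agentId: "sales", vendor: "customer-quote", amountUsd: 14800, marginPct: 28 }, clean, policy, {});
const red = evaluate({ agentId: "comms", vendor: "supplier-a", amountUsd: 12000 }, flagged, policy, {});

console.log(green.verdict, yellow.verdict, red.verdict); // ALLOW ESCALATE BLOCK
```

Ordering the checks from most to least severe means the returned `rule` always names the strongest reason for the verdict, which is the value an audit row would most plausibly record.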
19 May 2026