Assay is a Band-native multi-agent system that screens stocks the way a governed desk should: AI directs, code computes, the human decides. Three agents collaborate through Band @mention chat rooms in a shared session we call a rondaan (patrol round). Skaut, the scout, surveys the data. Konduktor, the conductor, orchestrates the run and writes the reports. Pengulas, the reviewer, audits them. Between the agents sits a deterministic Python engine. No agent computes a ratio; they call the engine as a tool, so the model directs, explains, and audits but never calculates. That removes the hallucination surface where most agent demos invent a number. The wedge is correctness, not just honesty. Most governed-agent systems check whether the AI was honest about its number: did it cite evidence, was the transcript tampered? Assay checks whether the number is right. Every report carries a machine-readable claims block, and the auditor re-runs the engine on the source data and diffs the claims in pure code. Because the verdict never compares to a stored hash, it catches a figure that is wrong no matter how it got wrong, even when the report looks internally honest. A live tamper test flips the verdict from PASS to FAIL offline. The correctness control is portable: the audit is a pure function of report text plus source data with no model in the loop, so swapping the brain by one env var (GLM-5.2, Claude, Qwen3, GPT-4o) leaves the verdict byte-identical. Validated live on two brains, GLM-5.2 and Claude. 24/24 tests pass. It is a screening desk, not regulated advice.
Category tags: