We designed a framework that reduces the cost of coding models while performing similarly on benchmarks and real-world tasks (up to 75% lower cost in our demo, a solar system replica). It uses a collection of 8 agents, including a jury of models, a Planner, and a Coder, to plan, implement, review, fix, iterate, and improve. On 5 tasks from SWE-bench Lite, our framework scored as high as DeepSeek V4 Pro at 34% lower cost (scores are published on GitHub).

The whole point is to cut cost without giving up the review check. The premium model runs once; the Coder and reviewers run on cheap tiers and re-run only on feedback; extra jurors convene only on disagreement. A live per-token ledger measures every call against real prices.
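
The cost mechanism described above can be illustrated with a minimal Python sketch. It is not the framework's actual code: the names `PRICES`, `Ledger`, and `jury_verdict` are hypothetical, the per-token prices are made-up placeholders rather than real quotes, and the majority-vote rule is an assumption about how a jury might settle a disagreement.

```python
from dataclasses import dataclass, field

# Hypothetical USD prices per one million tokens as (input, output).
# Placeholder values for illustration only, not real provider prices.
PRICES = {
    "premium": (10.0, 30.0),
    "cheap": (0.5, 1.5),
}


@dataclass
class Ledger:
    """Per-token ledger: every model call is priced and recorded."""
    entries: list = field(default_factory=list)

    def record(self, role, tier, tokens_in, tokens_out):
        price_in, price_out = PRICES[tier]
        cost = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
        self.entries.append((role, tier, cost))
        return cost

    def total(self):
        return sum(cost for _, _, cost in self.entries)


def jury_verdict(base_votes, extra_jurors):
    """Return (approved, jurors_used).

    base_votes: booleans from the jurors that always run.
    extra_jurors: zero-argument callables, invoked only when the
    base jurors disagree (assumed escalation rule).
    A strict majority approves; a tie counts as a rejection.
    """
    votes = list(base_votes)
    if len(set(votes)) > 1:  # disagreement: convene the extra jurors
        votes += [juror() for juror in extra_jurors]
    approvals = sum(votes)
    return approvals * 2 > len(votes), len(votes)


if __name__ == "__main__":
    ledger = Ledger()
    ledger.record("planner", "premium", 1000, 500)  # premium tier runs once
    ledger.record("coder", "cheap", 2000, 1000)     # cheap tier does the rest
    print(f"total cost: ${ledger.total():.4f}")
    print(jury_verdict([True, False], [lambda: True]))
```

In this sketch the extra juror is never called when the base jurors agree, so its tokens never reach the ledger; that is the saving the escalation rule is meant to capture.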
Category tags: