llmtrace: which deploy spiked your LLM bill


Created by team Llmtrace on May 18, 2026

AgentOps Antigravity AI Studio Gemini AI HuggingFace Spaces

Every team shipping AI features eventually asks the same question: why did our LLM bill double last week, and what shipped that caused it? Tools like Helicone, Langfuse, and LiteLLM show you that spend went up. They do not tell you why. llmtrace closes that gap.

llmtrace is a self-hosted reverse proxy for LLM provider APIs. You point your code at it instead of api.anthropic.com, and it records token usage, cost, latency, model, and a prompt fingerprint for every call into a local SQLite ledger. A rolling baseline plus a sigma threshold flags per-key spend anomalies the moment they appear.

When a spike is detected, a Gemini-powered agent takes over. It queries the ledger, finds deploys that landed in the surrounding time window, diffs the model and prompt mix before and after each one, and produces a causal attribution with a confidence score. It names the exact pull request responsible. If the regression is clear, the agent reads the source on GitHub, writes a corrected version, and opens a fix pull request on its own.

An autonomous watcher runs this loop continuously in the background, so attributions and remediation pull requests appear without anyone asking. A multimodal Vision Import feature lets you drop in a screenshot of any billing dashboard: Gemini reads the spend off the image, then the agent investigates your connected repository to find the cause.

llmtrace is written in Go, uses pure Go SQLite with no CGo, runs as a single container, and is deployed live on Google Cloud Run. The whole thing is one binary you can host yourself, with zero hosted SaaS dependency.
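To make the "rolling baseline plus sigma threshold" idea concrete, here is a minimal sketch in Go. It is not llmtrace's actual code: the `Detector` type, its fields, and the `Observe` method are hypothetical names chosen for illustration, and the use of a population standard deviation over a fixed-size window is an assumption about how such a check could work.

```go
package main

import "math"

// Detector is a hypothetical per-key spend monitor. It keeps the last
// Window interval totals as a baseline and flags a new total that sits
// more than Sigma standard deviations above the baseline mean.
type Detector struct {
	Window     int     // how many past intervals form the baseline
	Sigma      float64 // threshold, in standard deviations
	MinSamples int     // do not flag anything until this many intervals are seen
	history    []float64
}

// Observe compares spend against the baseline built from earlier
// intervals, then adds spend to the window. It reports whether the
// value is anomalous.
func (d *Detector) Observe(spend float64) bool {
	anomalous := false
	if n := len(d.history); n > 0 && n >= d.MinSamples {
		mean, std := meanStd(d.history)
		anomalous = spend > mean+d.Sigma*std
	}
	d.history = append(d.history, spend)
	if len(d.history) > d.Window {
		d.history = d.history[1:] // drop the oldest interval
	}
	return anomalous
}

// meanStd returns the mean and population standard deviation of xs.
// The caller guarantees that xs is non-empty.
func meanStd(xs []float64) (mean, std float64) {
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	var ss float64
	for _, x := range xs {
		ss += (x - mean) * (x - mean)
	}
	return mean, math.Sqrt(ss / float64(len(xs)))
}
```

With `Detector{Window: 5, Sigma: 3, MinSamples: 3}`, feeding interval totals of 10, 11, 9, and 10 flags nothing, and a following total of 30 is flagged. Two simplifications are worth noting: a perfectly flat history has zero deviation, so any increase would be flagged, and a flagged spike still enters the window and raises the baseline for later intervals. A production detector would likely handle both cases differently.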

Category tags:

Agent Builder track - The INTERNET OF AGENTS


Explore more applications

Replio

An AI-powered messaging copilot that uses a stateful agentic workflow to generate context-aware, tone-appropriate replies and alternatives, with human review and approval before sending.

Hiveminds

Token-Router: Hybrid Token-Efficient Routing Agent

An autonomous agent that answers each task with the cheapest model that can get it right. It tries a weak local model first, verifies its answer for free, and escalates to Fireworks only when it has to, minimizing total tokens without sacrificing accuracy.

JoyInAI

OpenAI, AMD Developer Cloud, REST API

Adaptive Efficient Token Hybrid Evaluation Router

AETHER is a high-performance, token-efficient hybrid AI agent designed to process natural language tasks, intelligently routing simple/context-contained tasks to a local LLM, and falling back to Fireworks cloud models for complex reasoning.

rm -rf

Antigravity, Falcon 2 11B

TokenRouter prove or escalate agent

An agent that spends Fireworks tokens only when it has to: free plain-code solvers first, then a bundled local Gemma, then Fireworks only after a verified miss. Two-thirds of tasks cost zero tokens, and a wrong free answer is structurally impossible.

FastSprint Builder

AutoPatch Forge: Autonomous Self-Healing CI/CD

An autonomous multi-agent framework built for containerized environments to intercept pipeline crashes, execute dynamic regex trace-decoding, run AST syntax-safe local patching, and automate timestamped Git PR branches in 0.02 seconds.

AutoPatch Forge

AMD Developer Cloud, Gemma

Raghav Sharma


AI app: llmtrace: which deploy spiked your LLM bill for AI Agent Olympics Hackathon