A CLI tool that measures AI coding agent output per dollar spent by correlating git activity with provider billing to answer whether agent compute is worth the cost
OpenClaw's creator just posted a $1.3M monthly OpenAI bill from 100 Codex agents, and the first community reaction was "show me something $1M worth of engineers couldn't do." Current tools track token costs, but not what those tokens actually produced. Teams running 10 to 100 coding agents have no way to know whether a $50K/month agent fleet ships more PRs, fixes more bugs, or reviews more code than the equivalent headcount. This tool hooks into git repos and LLM provider billing APIs to calculate cost per PR merged, cost per bug fixed, and cost per review completed, then compares agent productivity against team baselines.
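The three metrics above reduce to simple ratios once spend and deliverable counts are available for the same period. The following is a minimal sketch of that arithmetic, not the tool's actual implementation: the `PeriodStats` type, the function names, and every number are hypothetical and chosen only for illustration. It also assumes the full spend is divided by each deliverable count independently, which overstates per-item cost when one budget produces several kinds of output.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PeriodStats:
    """Spend and deliverable counts for one period (hypothetical shape)."""
    spend_usd: float
    prs_merged: int
    bugs_fixed: int
    reviews_completed: int


def cost_per(spend_usd: float, count: int) -> Optional[float]:
    """Return spend divided by count, or None when nothing was delivered."""
    return spend_usd / count if count > 0 else None


def cost_report(stats: PeriodStats) -> Dict[str, Optional[float]]:
    """Compute the three cost-per-deliverable metrics for one period."""
    return {
        "cost_per_pr_merged": cost_per(stats.spend_usd, stats.prs_merged),
        "cost_per_bug_fixed": cost_per(stats.spend_usd, stats.bugs_fixed),
        "cost_per_review_completed": cost_per(stats.spend_usd, stats.reviews_completed),
    }


def relative_to_baseline(agent: PeriodStats, baseline: PeriodStats) -> Dict[str, Optional[float]]:
    """Ratio of agent cost to baseline cost per metric; below 1.0 means agents are cheaper."""
    a, b = cost_report(agent), cost_report(baseline)
    return {
        key: (a[key] / b[key] if a[key] is not None and b[key] else None)
        for key in a
    }


# Made-up numbers for illustration only; not measurements from any real team.
agents = PeriodStats(spend_usd=50_000, prs_merged=400, bugs_fixed=100, reviews_completed=250)
team = PeriodStats(spend_usd=60_000, prs_merged=120, bugs_fixed=40, reviews_completed=300)

print(cost_report(agents))
print(relative_to_baseline(agents, team))
```

Returning `None` instead of raising on a zero count keeps a period with no merged PRs from crashing the report; a real tool would also need to decide how to attribute shared spend across deliverable types.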
Gap Assessment
Four tools exist (Helicone, Braintrust, Langfuse, Tokscale), but gaps remain. None correlates spend with git activity, computes cost per PR, or compares agents against a human baseline. None maps costs to actual shipped deliverables (PRs, issues, reviews). They track tokens and spend, not output or productivity.
Competitive Landscape
| Product | Does | Missing |
|---|---|---|
| Helicone | Gateway-attached LLM cost analytics with per-request logging, model comparison, and alerting | No git correlation, no cost-per-PR, no agent-vs-human baseline comparison. Tracks spend but not output. |
| Braintrust | LLM observability with traces capturing every call, retrieval step, and tool invocation with cost attached | No mapping of costs to actual shipped deliverables (PRs, issues, reviews). Tracks tokens, not productivity. |
| Langfuse | Self-hosted LLM observability with cost dashboards, evaluation framework, and prompt management | No git integration, no waste detection, no agent-vs-human ROI calculator. Cost visibility without output measurement. |
| Tokscale | CLI tracking token usage from OpenClaw, Claude Code, Codex with leaderboard and contributions graph | Tracks raw token counts only. No cost-per-deliverable, no waste detection, no ROI comparison. |