Claude Code Max subscribers burned full monthly quotas in under 90 minutes with no in-session visibility into abnormal token consumption or cache miss rates
Starting March 23 2026 Claude Code users on 100-200 dollar per month Max plans saw session quotas drain in 70-90 minutes instead of 5 hours on identical workloads. Root cause was a broken prompt caching layer in release v2.1.89 causing 3-50x token overconsumption per session. Developers had zero visibility: no cache hit rate display, no per-tool-call token counter, no flag when consumption was anomalous versus expected. Reddit threads accumulated 330-360 comments each within 24-72 hours as subscribers lost weeks of compute budget in days. The Anthropic postmortem confirmed three overlapping product changes caused the spike, resolved only after 4 weeks. No session-level observability tool exists that exposes prompt cache hit ratio and per-invocation token cost mid-session so developers can detect drain bugs before a monthly quota is gone.
Score Breakdown
Social Proof 2 sources
Gap Assessment
No tool exists to monitor Claude Code session health in real time: cache hit rate per invocation, actual vs expected token cost per tool call, and anomaly alerts when a session tracks 3x above baseline. Developers discover overconsumption only after the quota is depleted.