Claude Code 5-minute cache TTL regression causes 12-20x cost spikes for normal pausing workflows

Around March 6, 2026, Anthropic silently downgraded Claude Code's default prompt cache TTL from 1 hour to 5 minutes. Any pause longer than 5 minutes (reviewing code, thinking, handling an interruption) now lets the entire cached context expire, and the next prompt forces a full cache write at 12.5x the cost of a cache read. A session expected to cost $0.50/hr instead burns $5-10/hr with no visible warning.

An analysis of 119,866 API calls showed a 17.1% overall cost overage and 25.9% waste in March 2026 alone. The HN postmortem thread (HN #47878905) drew 942 points and 732 comments, making it the highest-engagement Claude Code issue of the year. Developers on subscription plans hit their 5-hour quota limits for the first time.

The Claude Code CLI has no built-in cache status indicator, no countdown showing when the cache will expire, no option to pay for the existing 1-hour TTL, and no session heartbeat to keep the cache warm. Two community projects exist (cnighswonger/claude-code-cache-fix at 297 stars, yujiachen-y/claude-code-cache-keepalive at 6 stars), but both are fragile hacks rather than properly shipped tools. No product offers a Claude Code companion that monitors cache state in real time, shows a countdown, warns before sending a prompt that will hit a cold cache on a large context, and optionally injects keepalive signals.
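The arithmetic behind the spike, and the kind of check a companion tool might run before a prompt is sent, can be sketched roughly as follows. This is a minimal illustration, not how Claude Code or any existing tool works: the function names, the warning threshold, and the price argument are hypothetical, and the two multipliers are assumptions chosen so that their ratio matches the 12.5x write-versus-read figure above. It also assumes the cache clock restarts at the last request, so idle time alone decides whether the cache is cold.

```python
from dataclasses import dataclass

# Assumed multipliers relative to the base input-token price. Their ratio
# (1.25 / 0.10 = 12.5) matches the write-vs-read figure quoted in the text.
CACHE_READ_MULTIPLIER = 0.10
CACHE_WRITE_MULTIPLIER = 1.25
CACHE_TTL_SECONDS = 5 * 60  # the 5-minute TTL described in the text


@dataclass
class CacheEstimate:
    seconds_left: float      # time until the cache is expected to expire
    is_cold: bool            # True if the next prompt would force a full cache write
    next_prompt_cost: float  # estimated dollars for the cached context alone


def estimate_next_prompt(context_tokens, idle_seconds, base_price_per_mtok):
    """Estimate cache state and context cost for the next prompt (hypothetical helper)."""
    seconds_left = max(0.0, CACHE_TTL_SECONDS - idle_seconds)
    is_cold = idle_seconds >= CACHE_TTL_SECONDS
    multiplier = CACHE_WRITE_MULTIPLIER if is_cold else CACHE_READ_MULTIPLIER
    cost = context_tokens / 1_000_000 * base_price_per_mtok * multiplier
    return CacheEstimate(seconds_left, is_cold, cost)


def should_warn(estimate, warn_above_dollars=0.25):
    """Warn only when the cache is cold AND the rewrite is expensive enough to matter."""
    return estimate.is_cold and estimate.next_prompt_cost >= warn_above_dollars
```

For example, with a 200,000-token context and a hypothetical base price of $3 per million input tokens, a prompt sent after 1 minute of idle time costs about $0.06 for the cached context, while the same prompt after 10 minutes costs about $0.75.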

Product Idea from this Signal

A web app that attributes and hard-caps AI coding assistant spend across seats, credit pools, and agent runs for engineering orgs

85.3k ▲

Engineering teams using Claude Code, GitHub Copilot, Cursor, and AI agents across multiple seats have no single place to see who is spending what, enforce a shared credit-pool budget before it is exhausted, or charge spend back to a project or team. Anthropic split credit pools in June 2026; GitHub Copilot moved to metered AI Credits on June 1, 2026; Uber burned its full 2026 AI coding budget in four months; Microsoft ordered engineers off Claude Code over uncontrolled token bills. LLM gateways like LiteLLM, Bifrost, and Helicone track per-virtual-key API spend but do not cross-reconcile seat-level coding-assistant usage across providers, enforce hard budget caps with cutoff enforcement, or produce chargeback reports by team or project. This product ingests usage across all major AI coding tools and agent frameworks, attributes every dollar to a seat, team, and project in real time, enforces hard caps before a shared pool is exhausted, and produces showback and chargeback reports for finance.
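The core of the attribution and hard-cap idea can be sketched in a few lines. This is an illustrative toy under stated assumptions, not a description of any existing product: `CreditPool`, `BudgetExceeded`, and the seat, team, and project labels are all hypothetical, and a real system would ingest usage from provider billing data rather than receive dollar amounts directly.

```python
from collections import defaultdict


class BudgetExceeded(Exception):
    """Raised when a charge would push the shared pool past its hard cap."""


class CreditPool:
    """Hypothetical shared credit pool with per-seat attribution and a hard cap."""

    def __init__(self, cap_dollars):
        self.cap_dollars = cap_dollars
        self.spent = 0.0
        self._by_key = defaultdict(float)  # (seat, team, project) -> dollars

    def charge(self, seat, team, project, dollars):
        # Check the cap before recording, so the pool is never overdrawn.
        if self.spent + dollars > self.cap_dollars:
            raise BudgetExceeded(
                f"charge of ${dollars:.2f} by {seat} exceeds remaining "
                f"${self.cap_dollars - self.spent:.2f}"
            )
        self.spent += dollars
        self._by_key[(seat, team, project)] += dollars

    def chargeback_by_team(self):
        # Roll attributed spend up to team level for a finance report.
        totals = defaultdict(float)
        for (_seat, team, _project), dollars in self._by_key.items():
            totals[team] += dollars
        return dict(totals)
```

Rejecting the charge up front, rather than alerting after the fact, is what distinguishes a hard cap from the spend tracking described above.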

ai-finops · engineering-spend · credit-pools · per-seat-attribution · ai-coding-governance

Competitive · 1000 leads

Score Breakdown

1,943

GitHub

600

Social Proof 4 sources

An update on recent Claude Code quality reports

mfiguiere · 4/23/2026

1,674 GH

claude-code-cache-fix: Fixes prompt cache regression causing up to 20x cost increase

cnighswonger · 3/20/2026

319 GH

Cache TTL silently regressed from 1h to 5m around early March 2026, causing quota and cost inflation

seanGSISG · 4/12/2026

281 HN

Anthropic downgraded cache TTL on March 6th

community · 4/5/2026

269

Gap Assessment

Underserved: existing solutions leave gaps

The HN thread (942 points, 732 comments) is a massive signal. Two community hacks exist, but neither is a real product with a UI, subscription model, or enterprise support. The 1-hour TTL path requires an undocumented env var. Gap: a real Claude Code cost-visibility companion that surfaces cache state and warns before cost spikes. No funded product exists in this space.

Virality Score

2,543

across 0 platforms

Details

Signal: issue

Ecosystem: ai_agent_mcp

Sources: 4

Platforms: 0

Updated: unknown

Trend: → stable

Top ideas

0 · A static linter that audits MCP server code for 2026-07-28 stateless spec compliance and flags every breaking change before it ships

0 · A CLI tool that automates full migration from Gemini CLI to Antigravity CLI

0 · A web app that attributes and hard-caps AI coding assistant spend across seats, credit pools, and agent runs for engineering orgs

Related signals

26.1M · Claude Code 512K-Line Source Code Leaked via npm Source Map

1.2M · Anthropic Doubles Claude Code Limits After SpaceX Compute Deal

81.4K · Anthropic Agent SDK June 15 billing split strands builders on $20-200 monthly credit caps with no overflow routing