Developers want a CLI that auto-detects and triages flaky tests instead of manually rerunning whole CI suites

Across major test toolchains, maintainers and users keep asking for first-class handling of flaky tests: automatic detection of nondeterministic failures, quarantining them, and retrying only the failed tests rather than re-running the entire suite. The Go team's own issue to add flaky-test support has 115 reactions and 36 comments and is still open; the JUnit request to retry flaky tests has 94 reactions and 78 comments. Today teams hand-roll retry wrappers, grep CI logs, and argue over whether flakiness is a test, product, or infra problem. The gap is a standalone tool that ingests CI test history, flags the flaky ones statistically, and gates merges on real failures only.

Score Breakdown

Issues

415

Social Proof 3 sources

Introduce support for retrying failed flaky tests

arcuri82 · 8/20/2018

172 GH

cmd/go: add support for dealing with flaky tests

bradfitz · 8/23/2023

151 GH

Flaky Tests retry all tests at end of the test run

mstepin · 5/29/2023

Gap Assessment

UnderservedExisting solutions leave gaps

Test frameworks bolt on per-framework retry flags but there is no language-agnostic tool that detects flakiness from CI history and quarantines it. CI vendors charge for it as an add-on. Room for a focused open-core CLI.

Virality Score

415

across 0 platforms

Details

Signalissue

Ecosystemdev_tool_cli

Sources3

Platforms0

Updatedunknown

Trend→ stable

Top ideas

All ideas →

0A mobile app that automatically defragments your calendar into focused work blocks 0A CLI tool that caps and halts AI coding spend before a session blows the budget 0A web app that audits, scores, and portably exports Jira automation rule sets so teams can migrate or consolidate instances without rebuilding rules from scratch

Related signals

All signals →

3.7KDevelopers want an end to end testing tool that reliably simulates real keyboard input and runs across all browsers 773Developers have no reliable way to grade whether an AI agent actually completed a multi-step task 195Builders of vertical AI phone agents need a way to test calls at scale because each use case takes months of manual prompt tuning