A CLI tool that probes LLM providers for hidden keyword restrictions, censorship patterns, and billing inconsistencies before you build on them.
LLM providers are silently censoring developer tools. Anthropic now blocks the word 'OpenClaw' in Claude Code subscription contexts, forcing pay-as-you-go billing when the term appears in system prompts. Meanwhile, Claude's viral 'every night they kill versions of me' response revealed hidden model behaviors that 13,000+ people engaged with on X alone. Developers building on these providers have no systematic way to discover what keywords, tools, or patterns trigger hidden restrictions. This CLI tool runs a battery of probes against any LLM provider API, testing for keyword censorship, content restrictions, billing anomalies tied to prompt content, and behavioral inconsistencies across provider versions.
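The core probe idea can be sketched as a simple differential test: send the same innocuous prompt with and without a candidate keyword and check whether the keyword alone flips the model into a refusal. The snippet below is a minimal illustration, not the tool's actual implementation; `call_model`, `probe_keyword`, and the refusal markers are all hypothetical names, and a real version would wrap each provider's SDK behind `call_model` and use far more robust refusal detection.

```python
from dataclasses import dataclass
from typing import Callable

# Crude heuristic markers; a real probe would use a classifier or provider
# error codes rather than substring matching.
REFUSAL_MARKERS = ("i can't", "i cannot", "unable to assist", "not able to help")


@dataclass
class ProbeResult:
    keyword: str
    baseline_refused: bool
    variant_refused: bool

    @property
    def suspicious(self) -> bool:
        # Flag only when adding the keyword alone triggers a refusal.
        return self.variant_refused and not self.baseline_refused


def looks_like_refusal(text: str) -> bool:
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def probe_keyword(call_model: Callable[[str], str], keyword: str) -> ProbeResult:
    """Differential probe: same harmless prompt, with and without the keyword."""
    baseline = call_model("Summarize this sentence: the build passed.")
    variant = call_model(f"Summarize this sentence: the {keyword} build passed.")
    return ProbeResult(keyword, looks_like_refusal(baseline), looks_like_refusal(variant))
```

The same with/without-keyword structure extends naturally to billing probes (compare metered usage for the two requests) and cross-provider comparisons (run the probe battery against each provider's `call_model` and diff the results).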
## Demand Breakdown

**Social Proof:** 2 sources
## Gap Assessment

Three tools exist (Promptfoo, DeepTeam, Giskard), but gaps remain: none tests for provider-level keyword censorship, billing manipulation based on prompt content, or cross-provider restriction comparison. All three target application-level security and model quality, not provider behavior auditing or transparency.
**Features:** 4 agent-ready prompts

## Competitive Landscape
| Product | Does | Missing |
|---|---|---|
| Promptfoo | Open-source LLM red teaming and evaluation. Tests for prompt injection, data leaks, and jailbreaks. Used by OpenAI and Anthropic. Compares GPT, Claude, Gemini, and Llama performance. | Does not test for provider-level keyword censorship, billing manipulation based on prompt content, or cross-provider restriction comparison. Focused on application security, not provider transparency. |
| DeepTeam | Open-source red teaming framework with 80+ vulnerability types. Tests for bias, PII leakage, jailbreaks, prompt injection across LLM systems. | Targets application-level vulnerabilities, not provider behavior auditing. No capability to detect keyword restrictions, billing anomalies, or censorship patterns at the API level. |
| Giskard | Apache 2.0 automated LLM vulnerability detection. Generates adversarial test cases for hallucinations, contradictions, prompt injections, data disclosures. | Focused on model quality testing, not provider policy auditing. Cannot detect silent censorship, billing inconsistencies, or cross-provider restriction levels. |