Agents hallucinate calls to non-existent tools and corrupt structured outputs when switching between the OpenAI UI and the Responses API

After OpenAI shipped the Responses API (the replacement for the Assistants API, which has an August 2026 shutdown deadline), developers on the community forum report two compounding failure modes.

First, the model generates tool calls for tools that are not defined in the schema (for example, a hallucinated 'search' function with valid-looking parameters) while also producing a useful response in the same turn. This leaves the handler in an ambiguous state with no clean error path.

Second, agents configured identically in the Agent Kit UI and in the Responses API produce dramatically different outputs: zero hallucinations in the UI, fabricated UUIDs in the API integration. The discrepancy traces to undocumented configuration differences (temperature defaults, how tool output is integrated into the context window, implicit strictness controls).

Strict mode (constrained decoding) exists as an opt-in fix but is not the default, so most production deployments silently tolerate type mismatches, missing required fields, and invalid enum values.

The earlier Assistants-to-Responses migration CLI signal (already in DB) focused on migration tooling. This signal is distinct: it concerns the runtime output integrity of already-migrated agents in production, specifically the UI-vs-API behavioral gap and hallucinated tool invocations after migration.
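The two runtime failures described above (a call to an undeclared tool, and arguments with wrong types, missing required fields, or invalid enum values) can be caught with a guard that runs before any handler is dispatched. The following is a minimal sketch, not an official client pattern. It assumes each tool call arrives as a dict with a `name` and a JSON-encoded `arguments` string; the `TOOLS` registry, the `get_order` tool, and `check_tool_call` are hypothetical names used only for illustration.

```python
import json

# Hypothetical registry: tool name -> simplified JSON-Schema-like spec.
TOOLS = {
    "get_order": {
        "required": ["order_id", "status"],
        "properties": {
            "order_id": {"type": "string"},
            "status": {"type": "string", "enum": ["open", "shipped", "cancelled"]},
        },
    },
}

PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def _type_ok(value, type_name):
    # bool is a subclass of int in Python, so reject it for numeric fields.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, PY_TYPES[type_name])


def check_tool_call(call, tools=TOOLS):
    """Return a list of problems for one tool call; an empty list means it passed."""
    name = call.get("name")
    if name not in tools:
        # Hallucinated tool: fail explicitly instead of guessing a handler.
        return [f"unknown tool: {name!r}"]
    try:
        args = json.loads(call.get("arguments") or "{}")
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    spec = tools[name]
    problems = [
        f"missing required field: {field}"
        for field in spec["required"]
        if field not in args
    ]
    for field, value in args.items():
        prop = spec["properties"].get(field)
        if prop is None:
            problems.append(f"unexpected field: {field}")
        elif not _type_ok(value, prop["type"]):
            problems.append(f"wrong type for {field}: expected {prop['type']}")
        elif "enum" in prop and value not in prop["enum"]:
            problems.append(f"invalid enum value for {field}: {value!r}")
    return problems
```

A handler could call `check_tool_call` on every tool call in a response and, if any list is non-empty, reject the whole turn rather than act on the accompanying text. That gives the ambiguous "useful response plus bogus tool call" case a single, explicit error path.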

Score Breakdown

OPENAI_FORUM

201

Social Proof: 1 source

Significant Hallucination Discrepancy: Agent Kit Chat Tab vs API Workflow Integration

Aviv_Ohayon · 1/5/2026

201

Gap Assessment

Underserved: existing solutions leave gaps

Guardrails AI and Instructor add schema enforcement but require wrapping every call and are not aware of the UI/API behavioral gap. No product offers a drop-in Responses API proxy that enforces strict schema on every tool call, detects and surfaces the UI-vs-API config delta before deployment, and logs hallucinated-tool invocations for replay. This is a narrow, focused gap with no funded competitor in this exact slot.
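The "config delta" part of that gap can be illustrated with a small comparison step run before deployment. This is a sketch under stated assumptions: it presumes the effective UI and API settings can each be exported as a flat dict, which the source does not confirm is possible. `config_delta`, `MISSING`, and the keys and values in the usage example are hypothetical and do not reflect actual OpenAI defaults.

```python
# Sentinel for a setting that one side never sets explicitly, i.e. a place
# where an implicit (possibly undocumented) default would apply.
MISSING = "<unset>"


def config_delta(ui_config, api_config):
    """Map each differing key to a (ui_value, api_value) pair."""
    delta = {}
    for key in sorted(set(ui_config) | set(api_config)):
        ui_val = ui_config.get(key, MISSING)
        api_val = api_config.get(key, MISSING)
        if ui_val != api_val:
            delta[key] = (ui_val, api_val)
    return delta


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    ui = {"model": "example-model", "temperature": 0.2, "strict": True}
    api = {"model": "example-model", "temperature": 1.0}
    for key, (ui_val, api_val) in config_delta(ui, api).items():
        print(f"{key}: UI={ui_val!r} API={api_val!r}")
```

Keys reported as `<unset>` on one side are the interesting ones: they mark settings where one environment relies on a default the other overrides, which is the kind of hidden difference the forum report points to.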

Virality Score

201

across 0 platforms

Details

Signal: issue

Ecosystem: ai_agent_mcp

Sources: 1

Platforms: 0

Updated: unknown

Trend: → stable
