Open Responses: an open inference standard for agents
Open Responses arrives as an open inference standard designed specifically for the era of autonomous agents. Initiated by OpenAI and developed with the open source community and the Hugging Face ecosystem, it aims to overcome the limitations of traditional chat formats.
What is Open Responses
Open Responses is an open specification based on the Responses API (launched in 2025) that unifies how models generate text, images, structured JSON and videos, and how they execute agentic loops that call tools and return final results.
Why does it matter? Because most of the ecosystem still uses formats built for turn-by-turn conversations, and those don't fit well when systems need to reason, plan and act over long time horizons. Open Responses proposes a standardized, extensible format better suited for those flows.
Design and key technical changes
Stateless by default: the standard assumes independent requests, with support for encrypted reasoning when the provider requires it.
Standardized configuration parameters: this makes interoperability between models and providers easier.
Streaming defined as semantic events: the stream carries more than text deltas; it describes typed events such as reasoning steps, tool calls and state changes.
Extensible: allows provider-specific parameters to cover practical differences without breaking compatibility (see the sketch right after this list).
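To make the last two points concrete, here is a minimal sketch of what a request payload could look like. The sampling fields mirror the Responses API; the provider_options block and its contents are hypothetical and only show where a provider-specific extension could live without breaking the base schema.
# Illustrative request payload, written as a Python dict mirroring the JSON wire format.
request_payload = {
    "model": "moonshotai/Kimi-K2-Thinking:nebius",
    "input": "Summarize the attached incident report",
    "temperature": 0.2,            # standardized sampling parameter
    "max_output_tokens": 1024,     # standardized output cap
    "provider_options": {          # hypothetical provider-specific extension
        "nebius": {"priority": "low_latency"},
    },
}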
Streaming and reasoning
Open Responses replaces "reasoning_text" with extensible reasoning chunks. Streaming is no longer just sending text fragments: events are transmitted with a type, a sequence number and metadata, which improves observability and reproducibility.
Example request via a proxy (curl):
curl https://evalstate-openresponses.hf.space/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $HF_TOKEN" \
-H "OpenResponses-Version: latest" \
-N \
-d '{
"model": "moonshotai/Kimi-K2-Thinking:nebius",
"input": "explain the theory of life"
}'
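On the client side, the same request can be consumed as a stream of semantic events. The sketch below is assumption-laden: it presumes the proxy emits server-sent events whose data: lines carry JSON objects with a type field, and that streaming is requested with a "stream": true flag as in the Responses API; the event names follow the response.reasoning.* naming used later in this article and are not an exhaustive list.
import json
import os
import requests

resp = requests.post(
    "https://evalstate-openresponses.hf.space/v1/responses",
    headers={
        "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
        "OpenResponses-Version": "latest",
    },
    json={
        "model": "moonshotai/Kimi-K2-Thinking:nebius",
        "input": "explain the theory of life",
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    # Server-sent events arrive as lines of the form "data: {...}".
    if not line or not line.startswith(b"data:"):
        continue
    payload = line[len(b"data:"):].strip()
    if payload == b"[DONE]":
        break
    event = json.loads(payload)
    etype = event.get("type", "")
    if etype.startswith("response.reasoning"):
        print("[reasoning]", event)            # reasoning chunk
    elif etype.endswith("output_text.delta"):
        print(event.get("delta", ""), end="")  # visible text delta
    else:
        print("[event]", etype)                # tool calls, state changes, etc.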
Internal and external tools
Open Responses clearly distinguishes two categories of tools:
Internal: run inside the provider's infrastructure (for example, file searches or Drive integrations). The provider executes the call and feeds the result to the model without the developer running it manually.
External: functions or servers hosted outside the provider, executed by the client or external infrastructures.
This separation lets providers optimize internal loops and lets routers orchestrate calls across different upstreams.
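As an illustration, a single request could declare one tool of each category. The type names below follow the Responses API convention (file_search for a hosted tool, function for a client-executed one) and are assumptions here; send_email is a hypothetical external tool.
# Two tool categories in one request: the provider runs "file_search" itself,
# while "send_email" is declared as a function the client will execute.
tools = [
    {"type": "file_search"},  # internal: executed inside the provider's infrastructure
    {
        "type": "function",   # external: executed by the client
        "name": "send_email",
        "description": "Send an email to a list of recipients",
        "parameters": {
            "type": "object",
            "properties": {
                "to": {"type": "array", "items": {"type": "string"}},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    },
]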
Standardized agent loop
The specification formalizes the loop: model sampling, emission of tool calls, tool execution (internal or external), feeding results back to the model, and repeating until the task is complete. For control, the client can use parameters like max_tool_calls and tool_choice:
{
"model": "zai-org/GLM-4.7",
"input": "Find Q3 sales data and email a summary to the team",
"tools": [...],
"max_tool_calls": 5,
"tool_choice": "auto"
}
The response includes all intermediate items (tool calls, tool results and reasoning steps), which makes auditing and debugging easier.
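A minimal sketch of the client side of that loop is shown below, under a few assumptions: intermediate items arrive in an output list, external tool results are fed back as function_call_output items in the next request's input, and the item and field names mirror the Responses API rather than being quoted from the Open Responses text. The send_email tool is hypothetical.
import json
import os
import requests

URL = "https://evalstate-openresponses.hf.space/v1/responses"
HEADERS = {
    "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
    "OpenResponses-Version": "latest",
}
TOOLS = [{
    "type": "function",
    "name": "send_email",
    "description": "Email a text summary to the team",
    "parameters": {
        "type": "object",
        "properties": {"body": {"type": "string"}},
        "required": ["body"],
    },
}]

def send_email(body: str) -> str:
    # Stand-in for a real integration; returns a JSON string as the tool result.
    return json.dumps({"status": "sent"})

items = [{"role": "user", "content": "Find Q3 sales data and email a summary to the team"}]
while True:
    response = requests.post(URL, headers=HEADERS, json={
        "model": "zai-org/GLM-4.7",
        "input": items,
        "tools": TOOLS,
        "max_tool_calls": 5,
        "tool_choice": "auto",
    }).json()

    output = response.get("output", [])
    items.extend(output)  # keep every intermediate item as an audit trail
    calls = [o for o in output if o.get("type") == "function_call"]
    if not calls:
        break             # no pending tool calls: the task is complete

    for call in calls:
        args = json.loads(call["arguments"])
        items.append({
            "type": "function_call_output",
            "call_id": call["call_id"],
            "output": send_email(**args),
        })
Because the loop keeps every intermediate item, the final items list doubles as the audit trail of reasoning steps, tool calls and tool results mentioned above.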
Migration: clients, providers and routers
Clients: if you already support the Responses API, migrating to Open Responses is usually low-effort. The main changes are adopting reasoning chunks and handling richer state payloads (for example, states specific to code interpreters).
Model providers: if you already comply with Responses, the transition is straightforward; you mainly need to support the new semantic events and the standardized parameters.
Routers (intermediaries): can normalize endpoints and expose provider-specific configuration options for customization without breaking compatibility.
Over time, useful features will tend to be standardized in the base specification, reducing the fragmentation that exists today from undocumented extensions.
Implications for security, observability and operations
Improved observability: semantic events and intermediate states allow you to trace reasoning steps and tool calls.
Privacy and encryption: the stateless mode and support for encrypted reasoning help meet regulatory requirements for some providers.
Cost and latency control: routers can limit max_tool_calls and choose providers based on latency or cost, which is essential in production (a sketch follows this list).
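As a purely hypothetical sketch of that last point, a router could clamp max_tool_calls and pick an upstream by advertised cost before forwarding the request; the upstream names and prices below are invented for illustration.
# Hypothetical router-side policy: cap tool calls and route to the cheapest upstream.
UPSTREAM_COST = {"provider-a": 0.8, "provider-b": 1.2}  # illustrative price per 1M tokens

def apply_policy(request: dict, max_calls_cap: int = 3) -> tuple[str, dict]:
    request = dict(request)  # don't mutate the caller's payload
    request["max_tool_calls"] = min(request.get("max_tool_calls", max_calls_cap), max_calls_cap)
    upstream = min(UPSTREAM_COST, key=UPSTREAM_COST.get)
    return upstream, request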
From my experience orchestrating search-and-summarize pipelines, having all the steps (search, summarization, drafting) inside a single request reduces infrastructure complexity and makes traceability much easier. Open Responses aims precisely at that kind of flow.
Practical recommendations for developers
Start by trying the early access version on Hugging Face to validate your current stack.
Adapt clients to consume response.reasoning.* events and store intermediate items for auditing.
Define limits (max_tool_calls) and tool-selection policies before you put agents into production.
Evaluate encryption needs if you work with sensitive data and prioritize providers that offer encrypted reasoning.
Open Responses is not just another API: it's an attempt to align inference interfaces with the real needs of autonomous agents and with interoperability across providers. If you work with agents, it's worth exploring the specification and planning the migration.