Interactions API: Google unifies the interface for Gemini | Keryc
Google announces that the Interactions API is now generally available as its main interface to work with Gemini models and agents. What does this mean for you as a developer, entrepreneur, or curious tinkerer? In short: a simpler, more powerful way designed for stateful workflows and agents that run autonomous tasks.
What is the Interactions API
The Interactions API is the new recommended entry point to call models (passing a model ID) or run autonomous agents (passing an agent ID). Got a long task or one that needs to run code and browse the web? Just add background=True and the server will handle the interaction in the background.
It’s not just another API: Google built it to support stateful interactions, tools, multimodality, and agents that perform multiple steps. That’s why it’s now the default interface in Google AI Studio, the Gemini API, and the official docs.
Key updates since beta
Managed Agents: with a single call you can provision a remote Linux sandbox where an agent reasons, runs code, browses, and manages files. The Antigravity agent comes by default, and you can create custom agents with instructions, skills, and data sources.
Background execution: mark background=True and the interaction runs asynchronously on the server. Ideal for processes that take time or require long steps.
Improved tools: you can mix built-in tools like Google Search and Google Maps with your own functions in a single request. Tool results can now return images as well as text.
Deep Research upgrades: agent versions optimized for speed or depth, collaborative planning, native charts and infographics, and multimodal grounding with images, PDFs, and audio.
Media generation: images with Nano Banana 2 and grounding in Google Image Search, music with Lyria 3, and expressive multi-speaker TTS voices.
From Roles to Steps: the new schema simplifies the flow: each action (user_input, thought, function_call, model_output, etc.) is a typed step instead of the old roles.
Costs and optimizations: new Flex and Priority tiers to tune cost or latency (Flex promises up to 50% cost reduction). Errors now point to the exact field. In the paid tier you can retrieve past interactions with 55 days retention.
Why should you care?
If you build apps that use agents, memoryful workflows, or multimodal generation, this API makes it easier to integrate everything with less code and better patterns. Are you an entrepreneur building an assistant that writes code, analyzes documents, or produces media? Managed Agents and background execution do things that used to require complex infrastructure.
For non-technical users, the consequence is simple: smarter apps that can keep long context, take real actions, and produce multimodal content (text, images, audio) without needing constant human waiting.
Migration and compatibility
The old generateContent API remains supported and will continue to receive major models, but Google recommends starting new projects with the Interactions API. Frontier features for long models and agents will likely appear first in the Interactions API, since it was designed for agentic and stateful flows.
Google published a migration guide that maps every field from the old schema to the new one, and also offers a Skill called gemini-interactions-api that injects recommended patterns into your agent context (streaming, function calls, structured output, Deep Research).
How to get started today
The Interactions API is available in the Python and JavaScript SDKs.
Integrations with partners like LiteLLM, Eigent, and Agno already support the Interactions API.
Grab your API key from Google AI Studio and follow the docs/migration guide to adapt existing projects.
Think of this as the standardizer: less glue work between services, more focus on your product logic and the user experience.
Final reflection
It’s not just a technical update; it’s a bet on agentic flows that act in the digital world more reliably and at scale. If you’re building assistants, productivity agents, or multimodal pipelines, it’s worth trying the Interactions API now and planning your migration calmly.