OpenAI and CAISI/AISI strengthen security of AI agents

3 minutes
OPENAI
OpenAI and CAISI/AISI strengthen security of AI agents

OpenAI announced a technical collaboration with US and UK standards bodies to evaluate and improve the safety of its most advanced systems. Why should you care if you use a browser assistant or an API in your business? Because these exercises directly affect how faults are detected and fixed before they reach real users. (openai.com)

Collaborating to secure the deployment of AI agents

OpenAI worked with the US Center for AI Standards and Innovation (CAISI) to red-team agentic systems like ChatGPT Agent. In July, CAISI received early access, investigated the architecture, and found combinations of traditional vulnerabilities and agent-specific issues that could lead to remote control of the system under certain conditions. The test exercise produced a proof of concept with a success rate close to 50 percent. OpenAI fixed those issues within one business day after receiving the reports. (openai.com)

Why does this matter to you? Because it shows agent security isn’t just a classic software problem nor just a language model problem: it’s the intersection of both. Evaluating systems with teams that know cybersecurity and agent safety uncovers compound attack chains that are hard to see in isolation. (openai.com)

What CAISI did specifically

  • Received early access to ChatGPT Agent to understand its architecture.
  • Identified two new vulnerabilities that, combined, could allow impersonation and remote control within an agent session.
  • Built an exploitation chain that combined classic vectors and an agent takeover attack, showing how seemingly harmless failures can be chained together. (openai.com)

Biosecurity collaboration with UK AISI

At the same time, the UK AI Security Institute (UK AISI) has been working since May on red-teaming focused on biological risks for ChatGPT Agent and GPT-5. OpenAI gave them deep access: non-public prototypes, model variants labeled “useful-only” with certain guards removed, and even access to internal safety monitors’ chains of thought (chain of thought) to speed up testing. The tests were done in rapid iterations: probe, patch, retest. (openai.com)

The UK AISI team submitted more than a dozen detailed reports that led to configuration fixes, policy adjustments, and targeted training for classifiers and monitoring systems. Thanks to that work, OpenAI strengthened its protection stack and measured robustness against universal jailbreaks identified by UK AISI. (openai.com)

The tests included both manual attacks and automated techniques that forced improvements in monitoring and product configuration. (openai.com)

What does this mean for users and businesses?

  • More practical security: finding real vulnerabilities before deployment reduces risk for users and customers. (openai.com)
  • Shared best practices: public-private collaboration helps governments and industry learn how to evaluate complex systems.
  • Continuous assessments: the rapid iterate model (test, harden, repeat) shows that AI security is an ongoing process, not a checkbox when you ship a product. (openai.com)

Looking ahead

This isn’t just a note for specialists: it’s an example of how real tests and technical exchange with external experts can raise the security level of tools you use every day. Does it make you more comfortable to know combined attacks were tested and the response was quick? At the same time, stay vigilant: the complexity of AI agents will keep creating new attack surfaces.

The practical lesson is simple: if you integrate agents or APIs into your processes, ask your provider for transparency about external tests and vulnerability responses. The collaborations OpenAI announced with CAISI and UK AISI offer a model other companies should follow to protect their users. (openai.com)

Stay up to date!

Receive practical guides, fact-checks and AI analysis straight to your inbox, no technical jargon or fluff.

Your data is safe. Unsubscribing is easy at any time.