OpenAI collaborates with the US and UK on AI safety

9 hours agoKeryc Díaz3 minutes

OPENAI

OpenAI published an update about two technical collaborations with agencies in the United States and the United Kingdom to evaluate and improve the safety of its AI systems. Why should you pay attention? Because these tests weren't theoretical: they found real vulnerabilities, reported them, and they were fixed quickly. (openai.com)

What happened in a few words

For more than a year OpenAI worked with the Center for AI Standards and Innovation (CAISI) in the United States and with the UK AI Security Institute (UK AISI). The goal: put OpenAI's models and products through deep security testing and practical red teaming, both in domains tied to national security and in emerging vectors specific to agent systems. (openai.com)

Collaboration on agent security

CAISI got early access to ChatGPT Agent and ran an assessment that mixed cybersecurity expertise with agent-security knowledge. They discovered two novel vulnerabilities that, when chained with an agent hijacking attack, could —under certain conditions— let an attacker remotely control systems the agent could access and even impersonate the user on other sites. The team built a proof of concept with an approximate 50 percent success rate. OpenAI received the reports and fixed the issues within one business day. (openai.com)

Surprised that an AI agent can be part of an attack chain? Should you be? It's exactly the lesson: AI security isn't only about models versus prompts anymore. It's where traditional software vulnerabilities meet new ways to exploit what agents can do. Think of it like a compromised smart assistant scheduling harmful actions or a bot being tricked into giving away access—these are everyday analogies that make the risk tangible. (openai.com)

Collaboration on biosecurity

With UK AISI, the focus was defenses against biological misuse. Starting in May, UK AISI ran red teaming on oversight safeguards and policies in non-public prototypes, on "helper-only" model variants with tightened restrictions, and even on internal monitor chains of thought to find failures faster. The work was iterative: test, OpenAI fixes, new test. In that process UK AISI submitted more than a dozen detailed reports that led to engineering fixes, tweaks in policy enforcement, and targeted training of classifiers. (openai.com)

Why does this matter now?

Because it shows that continuous, practical evaluation with outside experts uncovers issues that internal tests might miss. (openai.com)
Because public-private collaboration brings national-security, cybersecurity, and metrology expertise that complements a company's internal know-how. (openai.com)
Because these weren't PR exercises: they produced concrete changes in products used by millions. (openai.com)

What this means for users and companies

For you as an end user, it means platforms that adopt these practices can spot and mitigate more complex attacks before they hit your account or devices. For companies that integrate AI agents, the lesson is clear: demand independent security assessments and understand the extra attack surface an agent adds to your systems. For regulators, it's a practical example of how to design collaborative technical processes instead of relying only on administrative rules. (openai.com)

Final reflection

The key point isn't just that flaws were found, but that the collaborative mechanism allowed them to be fixed quickly and learned from. Is that enough for you? It depends on how much we value transparency, ongoing audits, and shared responsibility between companies and governments. In the short term, the practical route is to encourage more technical exercises like this and for companies to publish verifiable results from those tests.

Read OpenAI's original note for technical details and timeline: OpenAI update on CAISI and UK AISI. (openai.com)

Stay up to date!

Receive practical guides, fact-checks and AI analysis straight to your inbox, no technical jargon or fluff.