Transformers.js tests Cross-Origin Storage to speed up models | Keryc
Transformers.js now lets you experiment with a simple but powerful idea: sharing large model files and runtimes in the browser across different origins using cryptographic hashes. Why does it matter? Because it prevents duplicate downloads, saves disk space, and makes your web apps faster without drastic changes to your code.
What problem it solves
Imagine you visit two different sites that use the same Whisper model and the same Wasm runtime. Even if the URLs point to the same CDN, modern browsers partition caches by origin (for security) using a key called the Network Isolation Key. The result: identical downloads are repeated and storage is duplicated.
In the article's Transformers.js example, that adds up to 177 MB of duplicate data just from testing two origins with the same model. That is small for one app, but it grows quickly across many visits and many apps.
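To make the duplication concrete, here is a deliberately simplified model of the two keying strategies. It assumes a partitioned cache keys each entry by (top-level site, frame site, URL), which is roughly how Chrome builds its Network Isolation Key, while a hash-addressed store keys only by content hash. The URL, the hash label, and the helper names are hypothetical; only the 177 MB figure comes from the example above.

```javascript
// Simplified model, not browser code: compare how many megabytes end up on disk
// when the same 177 MB model is requested from two different sites.
function partitionedKey(topLevelSite, frameSite, url) {
  return [topLevelSite, frameSite, url].join(' ');
}

function totalMegabytes(store) {
  let sum = 0;
  for (const size of store.values()) sum += size;
  return sum;
}

const MODEL_URL = 'https://cdn.example/model.onnx'; // hypothetical URL
const MODEL_HASH = 'sha256:placeholder'; // hypothetical hash label
const MODEL_MB = 177;

const partitioned = new Map(); // keyed by site + URL
const byHash = new Map(); // keyed by content hash

for (const site of ['https://a.example', 'https://b.example']) {
  partitioned.set(partitionedKey(site, site, MODEL_URL), MODEL_MB);
  byHash.set(MODEL_HASH, MODEL_MB);
}

console.log(totalMegabytes(partitioned)); // 354
console.log(totalMegabytes(byHash)); // 177
```

The partitioned map holds two entries for one URL, so the second 177 MB is pure duplication; the hash-keyed map holds one.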
How Cross-Origin Storage (COS) works
The Cross-Origin Storage proposal introduces navigator.crossOriginStorage, an API that identifies files by their cryptographic hash (for example, SHA-256) instead of by URL. The basic flow is:
You request a FileHandle by hash with requestFileHandle().
If it already exists in COS, you get the file instantly as a File/Blob.
If it doesn't exist, you download it and write it into COS with create: true and the appropriate origins option.
The hash guarantees integrity: the browser verifies that the bytes written match the declared hash. If they don't match, the write fails.
Visibility is controlled with the origins option: '*' for global sharing, a list of origins for private sharing between sites you control, or no value at all to keep the file same-site only.
Visibility can increase but never decrease: if a file is already public, rewriting it can't make it private again.
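The steps above can be sketched as two small helpers. The method name requestFileHandle and the create and origins options come from the description above. Everything else is an assumption for illustration: the { algorithm, value } hash object, the options bag as a second argument, the helper names, and the idea that the returned handle behaves like a File System API file handle with getFile() and createWritable(). The proposal is still evolving, so treat this as a sketch rather than the API's definitive shape.

```javascript
// Look up a file by hash. Rejects if the file is not available to this origin.
// `storage` stands in for navigator.crossOriginStorage (assumed shape).
async function readFromStorage(storage, hash) {
  const handle = await storage.requestFileHandle(hash);
  return handle.getFile(); // a File/Blob on a hit
}

// Store downloaded bytes under their hash. `origins` may be '*', a list of
// origins, or undefined for same-site only, as described above.
async function writeToStorage(storage, hash, data, origins) {
  const handle = await storage.requestFileHandle(hash, { create: true, origins });
  const writable = await handle.createWritable();
  await writable.write(data);
  // Per the text above, the browser checks the bytes against the declared hash;
  // a mismatch would make the write fail.
  await writable.close();
}
```

A caller would try readFromStorage first and, if it rejects, download the file from the network and pass the bytes to writeToStorage so that later visits, on any permitted origin, can skip the download.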
Privacy and risks considered
Yes, sharing by hash opens the possibility of probing whether a resource exists on the device. COS mitigates this with two mechanisms:
The recommendation that sensitive resources not be stored with origins: '*'.
Availability gating: the browser can hide the existence of uncommon resources until they appear from enough distinct origins. In practice, an error from requestFileHandle() doesn't distinguish between "not present" and "present but withheld"; that's why your app should always fall back to the network download if it doesn't obtain the resource.
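Because a failed lookup carries no usable signal, the safest pattern is to treat every failure the same way. A minimal sketch, where loadWithFallback and both callbacks are hypothetical names rather than part of any library:

```javascript
// Try shared storage first; on ANY failure (or an empty result), use the network.
// The caller never learns whether the file was absent or merely withheld.
async function loadWithFallback(readShared, fetchFromNetwork) {
  try {
    const cached = await readShared();
    if (cached != null) return { data: cached, source: 'shared-storage' };
  } catch {
    // Deliberately ignored: "not present" and "present but withheld" look identical.
  }
  return { data: await fetchFromNetwork(), source: 'network' };
}
```

Returning the source alongside the data is only a debugging convenience; application logic should not branch on it.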
Practical integration in Transformers.js
Transformers.js already ships an experimental COS backend: set env.experimental_useCrossOriginStorage = true before creating any pipelines and you're set.
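In code, that could look like the sketch below. The flag name is the one given in this article. The createTranscriber wrapper, the dynamic import, and the choice of task and model ID are illustrative assumptions, so substitute whatever your app already uses.

```javascript
// Enable the experimental COS backend, then build a pipeline as usual.
async function createTranscriber() {
  const { env, pipeline } = await import('@huggingface/transformers');

  // Must be set before the first pipeline() call so model files go through COS.
  env.experimental_useCrossOriginStorage = true;

  // Illustrative task and model; any pipeline would be created the same way.
  return pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en');
}
```

Nothing else in the app needs to change: the rest of the pipeline code is the same with or without the flag.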
The library resolves the SHA-256 of each large file (ONNX weights, etc.) from its Xet pointer and uses that hash with navigator.crossOriginStorage. If another site has already stored the same file, Transformers.js receives it from COS without a network round trip. If not, it downloads the file and writes it into COS for future users.
Real benefit: the shared Wasm runtime (for example ort-wasm-simd-threaded.asyncify.wasm) and model weights are downloaded only once on the user's device, no matter how many different origins request them.
How to try it today
The COS API isn't natively implemented in browsers yet, but you can try it now with the Cross-Origin Storage extension that injects a polyfill for navigator.crossOriginStorage into pages. Quick steps:
Install the extension (it's aimed at developers and power users).
Enable env.experimental_useCrossOriginStorage = true in your Transformers.js app before the first pipeline().
Open the same demo from two different origins with the extension active and you'll see the second load get the model from COS in milliseconds instead of re-downloading 177 MB.
The extension also shows a panel with resources by hash and the origins sharing them, useful for debugging.
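To confirm that the extension's polyfill is actually active on a page, a simple feature check is enough. The helper name below is hypothetical; the only thing taken from the proposal is the navigator.crossOriginStorage property.

```javascript
// Returns true when the given navigator-like object exposes crossOriginStorage.
// Defaults to the page's own navigator when called without arguments.
function hasCrossOriginStorage(nav = globalThis.navigator) {
  return nav != null && typeof nav === 'object' && 'crossOriginStorage' in nav;
}
```

Running hasCrossOriginStorage() in the DevTools console of each origin is a quick way to check that both pages see the polyfill before comparing load times.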
Ecosystem status and call to developers
Transformers.js isn't the only library experimenting: projects like WebLLM and wllama are already trying COS. The Chrome team is evaluating a native implementation and the proposal is evolving. If you're interested in influencing the design or reporting issues, the Cross-Origin Storage repository accepts issues and PRs.
If you build in-browser model apps, opting into experimentation now helps speed up the Web for everyone: each site that enables COS reduces latency and data usage for others. And the best part: if the browser doesn't support COS, your code falls back to the traditional cache without breaking.
Today COS is a proposal with work ahead, but it's practical to test and offers concrete benefits: fewer duplicate downloads, automatic integrity checks, and more efficient use of local storage.