Hugging Face launches Storage Buckets for ML artifacts
Imagine a place on the Hub where you can drop everything that's "in motion": checkpoints, processed shards, logs and traces. Storage Buckets cover exactly that: mutable, S3-like storage designed for the ephemeral, high-performance artifacts that ML generates in production.
What Storage Buckets are and why they matter
A Bucket is a non-versioned container inside the Hub. It lives under your user or organization, respects Hugging Face permissions, can be private or public, and has both a web page and a programmatic address such as hf://buckets/user/my-bucket.
Why not use Git for this? Have you seen how noisy a training run gets when it writes checkpoints every few minutes? Git wasn’t built for large, mutable objects that change constantly. Buckets are designed to write fast, overwrite when needed, sync directories and remove obsolete files without fuss.
The technical advantage: Xet and chunk deduplication
Buckets are built on Xet, the chunk-based storage backend. Instead of treating every file as a monolithic blob, Xet splits content into pieces and deduplicates across them.
What does that mean for your pipelines?
- You upload a processed dataset that's very similar to the raw one, and many chunks already exist: you don't resend bytes that are already there.
- You store successive checkpoints where many layers don't change: shared chunks aren't stored again.
Result: less bandwidth, faster transfers and more efficient storage. For Enterprise customers this translates into billing on deduplicated storage, so sharing chunks reduces cost.
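The saving is easy to see with a toy model of chunk-level deduplication. This is a sketch only: Xet uses content-defined chunking rather than the fixed-size chunks below, but the accounting is the same.

```python
import hashlib

CHUNK = 64  # toy chunk size; a real chunker picks boundaries from content

def chunk_hashes(data: bytes) -> set[str]:
    """Hash fixed-size chunks so identical pieces collapse to one entry."""
    return {
        hashlib.sha256(data[i:i + CHUNK]).hexdigest()
        for i in range(0, len(data), CHUNK)
    }

# A "raw" file made of 10 distinct chunks, and a "processed" copy
# that differs in exactly one chunk.
raw = b"".join(bytes([i]) * CHUNK for i in range(10))
processed = raw[:9 * CHUNK] + b"\xff" * CHUNK

stored = chunk_hashes(raw)                     # chunks the backend already holds
to_upload = chunk_hashes(processed) - stored   # only new chunks travel
print(len(to_upload))  # 1 — nine of the ten chunks are deduplicated
```

Billing on deduplicated storage follows the same logic: the shared chunks are counted once, however many files reference them.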
Global performance and pre-warming by region
Buckets live on the Hub and by default are globally accessible. But latency matters when you run distributed training or large-scale pipelines.
The pre-warming feature lets you bring hot data closer to the provider and region where your compute runs. Instead of moving data between regions on every read, you declare where you need it and Buckets place it there before the job starts. Very handy for training clusters and multi-region setups.
Hugging Face starts with integrations for AWS and GCP. More providers will come later.
Quickstart with the CLI
You can create a Bucket in under 2 minutes with the hf CLI.
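As a sketch of what a minimal session could look like (only `hf buckets cp` and `hf buckets remove` are named in this article; the `create` subcommand and the `--private` flag are assumptions):

```shell
# Hypothetical: create a private Bucket under your namespace
hf buckets create my-training-bucket --private

# Copy a local file into the Bucket (command named in this article)
hf buckets cp ./model-final.ckpt hf://buckets/user/my-training-bucket/

# Delete an obsolete object (command named in this article)
hf buckets remove hf://buckets/user/my-training-bucket/old.ckpt
```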
There are also commands to copy single files with hf buckets cp and to clean objects with hf buckets remove.
Programmatic integration: Python, JavaScript and fsspec
If you prefer code, the huggingface_hub library has supported Buckets since v1.5.0. Integrating them into training scripts is straightforward:
```python
from huggingface_hub import create_bucket, list_bucket_tree, sync_bucket

# Create a private Bucket (no error if it already exists)
create_bucket('my-training-bucket', private=True, exist_ok=True)

# Mirror a local checkpoints directory into the Bucket
sync_bucket(
    './checkpoints',
    'hf://buckets/user/my-training-bucket/checkpoints',
)

# Walk the Bucket's contents
for item in list_bucket_tree('user/my-training-bucket', prefix='checkpoints', recursive=True):
    print(item.path, item.size)
```
For Node.js apps there’s support in @huggingface/hub since v2.10.5.
Also, Buckets work with HfFileSystem, compatible with fsspec. That means any library using fsspec can read and write directly to a Bucket. Practical examples:
```python
from huggingface_hub import hffs

# list
hffs.ls('buckets/user/my-training-bucket/checkpoints', detail=False)

# glob
hffs.glob('buckets/user/my-training-bucket/**/*.parquet')

# read
with hffs.open('buckets/user/my-training-bucket/config.yaml', 'r') as f:
    print(f.read())
```
And for Pandas or Polars:
```python
import pandas as pd

# read CSV directly from a Bucket
df = pd.read_csv('hf://buckets/user/my-training-bucket/results.csv')

# write results
df.to_csv('hf://buckets/user/my-training-bucket/summary.csv')
```
That makes it very easy to connect Buckets to existing pipelines without rewriting how you read or write files.
Good pattern: mutable layer vs versioned layer
Buckets are the place for things that are in motion. When something becomes a stable deliverable, the usual practice is to promote it to a versioned model or dataset repo on the Hub.
On the roadmap is support for direct transfers between Buckets and repos in both directions: promote final checkpoints to a model repo, or upload processed shards to a dataset repo. That way the working layer and the publication layer coexist in a continuous, native Hub workflow.
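Until direct transfers land, promotion is a download-then-upload step, and a small helper can decide which checkpoint to promote. This is a sketch under the assumption that checkpoints follow a `checkpoint-<step>` naming scheme; the repo name in the comment is made up, though `upload_file` is existing Hub API.

```python
def pick_latest_checkpoint(paths: list[str]) -> str:
    """Return the path with the highest training step, assuming
    names like 'checkpoints/checkpoint-1500.pt'."""
    def step(path: str) -> int:
        return int(path.rsplit('-', 1)[1].split('.')[0])
    return max(paths, key=step)

final = pick_latest_checkpoint([
    'checkpoints/checkpoint-500.pt',
    'checkpoints/checkpoint-1500.pt',
    'checkpoints/checkpoint-1000.pt',
])
print(final)  # checkpoints/checkpoint-1500.pt

# Then download that object from the Bucket and publish it to a versioned
# model repo, e.g. with huggingface_hub's upload_file (repo_id is hypothetical):
# HfApi().upload_file(path_or_fileobj='checkpoint-1500.pt',
#                     path_in_repo='model.pt', repo_id='user/my-model')
```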
Early experiences and adoption
Before public launch there was a private beta with partners like Jasper, Arcee, IBM and PixAI. That feedback helped improve reliability and usability.
Buckets are already included in Hub storage plans. Free accounts get space to get started, and PRO and Enterprise plans offer higher limits. For Enterprise, billing takes deduplication into account.
If you’re coming from S3, the experience will feel familiar, but with better guarantees for AI artifacts thanks to Xet and Hub integration.
Using Buckets lets you keep more of the ML lifecycle in one place: from experimentation to final publication.