DeepInfra arrives on Hugging Face as an Inference Provider
DeepInfra is now a supported Inference Provider on the Hugging Face Hub. What does that mean for you? More serverless inference options, plus direct integration on model pages and in the Hugging Face SDKs for Python and JavaScript, so you can plug models into your apps without much fuss.
What the integration brings
DeepInfra is a serverless inference platform that promises one of the most competitive per-token prices on the market. Its catalog exceeds 100 models and covers everything from LLMs to text-to-image, text-to-video and embeddings.
In this initial phase the integration covers conversational and text-generation tasks, including open-weight models like DeepSeek V4, Kimi-K2.6 and GLM-5.1. So what does that look like in practice? On model pages you'll see DeepInfra listed as a compatible provider, and you can select it from widgets and code snippets without setting up any infrastructure.
Modes of use and billing
Hugging Face offers two modes for calling Inference Providers. Which one fits you? It depends on how much control you want and how you want billing handled:
Custom key: you use the provider’s key (for example your DeepInfra API key). Calls go directly to the provider and the provider bills you.
Routed by HF: you use your Hugging Face token. The request is routed to DeepInfra but the charge is applied to your Hugging Face account. There’s no markup from HF; right now HF passes the provider cost through as-is.
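As a minimal sketch of the difference using huggingface_hub's InferenceClient (the environment variable names are illustrative; provider and api_key are the arguments recent huggingface_hub releases accept):

import os
from huggingface_hub import InferenceClient

# Routed by HF: authenticate with your Hugging Face token;
# the call goes through HF's router and is billed to your HF account.
routed = InferenceClient(provider="deepinfra", api_key=os.environ["HF_TOKEN"])

# Custom key: pass your own DeepInfra API key instead;
# DeepInfra bills you directly for these calls.
direct = InferenceClient(provider="deepinfra", api_key=os.environ["DEEPINFRA_API_KEY"])

# Either client exposes the same chat API, e.g.:
resp = routed.chat_completion(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[{"role": "user", "content": "Hello!"}],
)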
In your account you can also:
Configure your own keys for each provider you use.
Order providers by preference so widgets and snippets show your favorites first.
SDKs, examples and model format
The integration is already available through the Hugging Face SDKs: huggingface_hub (>= 1.11.2) for Python and @huggingface/inference for JavaScript. Models are called with a suffix that indicates the provider, for example deepseek-ai/DeepSeek-V4-Pro:deepinfra.
Python example (use your HF_TOKEN and the request will be routed to DeepInfra automatically):
import os
from openai import OpenAI

# The HF router exposes an OpenAI-compatible API, so the standard
# OpenAI client works; authenticate with your Hugging Face token.
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

# The :deepinfra suffix pins the request to DeepInfra.
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that returns the nth Fibonacci number using memoization."
        }
    ],
)

print(completion.choices[0].message)
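The response follows the OpenAI chat-completions schema, so the generated text itself lives in completion.choices[0].message.content.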
JavaScript example:
import { OpenAI } from "openai";

// Same router and token as the Python example; only the client changes.
const client = new OpenAI({
  baseURL: "https://router.huggingface.co/v1",
  apiKey: process.env.HF_TOKEN,
});

const chatCompletion = await client.chat.completions.create({
  model: "deepseek-ai/DeepSeek-V4-Pro:deepinfra",
  messages: [
    {
      role: "user",
      content: "Write a Python function that returns the nth Fibonacci number using memoization.",
    },
  ],
});

console.log(chatCompletion.choices[0].message);
If instead you use DeepInfra's direct key, you point at their endpoint and authenticate with your DeepInfra API key; in that case DeepInfra bills you directly.
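As a sketch of that direct route, assuming DeepInfra's OpenAI-compatible base URL (check their docs to confirm it); note that the :deepinfra suffix is HF-router syntax and doesn't apply here:

import os
from openai import OpenAI

# Direct access: DeepInfra's own OpenAI-compatible endpoint.
# The base URL is an assumption; verify it against DeepInfra's docs.
client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key=os.environ["DEEPINFRA_API_KEY"],
)

# Plain model ID; no :deepinfra suffix outside the HF router.
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)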
Integrations and ecosystem
Hugging Face's Inference Providers are integrated into several agent harnesses, for example Pi, OpenCode, Hermes Agents and OpenClaw. What does that mean for you? You can plug models hosted on DeepInfra directly into those agents without writing extra glue code.
You can also check the full list of models supported by DeepInfra and their dedicated documentation to see which tasks are already available and which are coming soon (text-to-image, text-to-video, embeddings, etc.).
Costs, PRO credits and recommendations
Hugging Face PRO users receive 2 USD in inference credits each month, and those credits can be spent across the different providers.
Hugging Face offers a small free quota for registered users, but if you need more capacity or credits it’s worth evaluating the PRO plan.
To control costs I recommend: using your own keys if you prefer direct billing to the provider, ordering providers by preference, and monitoring usage per model.
Good technical practices
If you want maximum transparency in billing and limits, use the provider’s custom key.
If you prefer simplicity and managing everything from Hugging Face, use the routed by HF mode.
Keep in mind the minimum SDK version (huggingface_hub >= 1.11.2) when you automate deployments.
When you test open models, confirm the :deepinfra suffix in the name to pin the request to DeepInfra's implementation (see the sketch below).
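A minimal sketch of pinning a provider versus letting the router choose (without a suffix the router selects a provider automatically; your provider-preference order influences that choice):

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

# With the suffix, the request is pinned to DeepInfra.
pinned = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages=[{"role": "user", "content": "ping"}],
)

# Without a suffix, the router picks a provider for you.
auto = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[{"role": "user", "content": "ping"}],
)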
The news is good for developers and teams who want serverless inference alternatives with competitive per-token costs. Interested in trying an end-to-end flow? Below is a starting point for a script that compares latency and token usage between DeepInfra and other providers.
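A rough sketch (the second provider name is a placeholder; substitute any provider listed on the model's page, and combine the token counts with each provider's per-token prices to estimate cost):

import os
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

MODEL = "deepseek-ai/DeepSeek-V4-Pro"
PROMPT = [{"role": "user", "content": "Reply with one word."}]

# Compare wall-clock latency and token usage per provider.
for provider in ("deepinfra", "another-provider"):  # second name is a placeholder
    start = time.perf_counter()
    resp = client.chat.completions.create(model=f"{MODEL}:{provider}", messages=PROMPT)
    elapsed = time.perf_counter() - start
    print(f"{provider}: {elapsed:.2f}s, {resp.usage.total_tokens} tokens")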