Voxtral Transcribe 2 from Mistral: transcribe in real time

Mistral has launched Voxtral Transcribe 2, a family of speech-to-text models built around two goals: accuracy and speed. Whether you need live captions with minimal lag or transcripts of hours of meetings that identify who spoke, this release is aimed straight at that problem.

What Mistral offers

The offering comes in two clear flavors:

Voxtral Mini Transcribe V2: a batch transcription model with diarization, timestamps, and support for 13 languages.
Voxtral Realtime: a model designed for live applications, with configurable latency that can drop below 200 ms. Its weights are open under the Apache 2.0 license.
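To make "diarization and timestamps" concrete, here is a minimal sketch of how a batch transcript with speaker labels could be turned into a readable meeting log. The `Segment` structure, its field names, and the speaker labels are assumptions made for illustration; they are not taken from Mistral's API, whose actual response format is not described here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized chunk of speech (hypothetical shape, not Mistral's schema)."""
    speaker: str   # diarization label, e.g. "speaker_1" (assumed naming)
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str


def fmt_ts(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def merge_turns(segments: list[Segment]) -> list[Segment]:
    """Merge consecutive segments from the same speaker into a single turn."""
    merged: list[Segment] = []
    for seg in segments:
        if merged and merged[-1].speaker == seg.speaker:
            last = merged[-1]
            merged[-1] = Segment(last.speaker, last.start, seg.end,
                                 f"{last.text} {seg.text}")
        else:
            merged.append(seg)
    return merged


def render(segments: list[Segment]) -> str:
    """Render one line per speaker turn, prefixed by the turn's start time."""
    return "\n".join(
        f"[{fmt_ts(s.start)}] {s.speaker}: {s.text}" for s in merge_turns(segments)
    )


if __name__ == "__main__":
    demo = [
        Segment("speaker_1", 0.0, 2.5, "Hello everyone."),
        Segment("speaker_1", 2.5, 4.0, "Let's start."),
        Segment("speaker_2", 4.2, 6.0, "Sounds good."),
    ]
    print(render(demo))
```

Merging consecutive segments by speaker keeps the log readable when the transcript is split into many short fragments; the timestamps shown are the start of each merged turn.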

Mistral also added an audio playground in Mistral Studio so you can try transcription instantly, with diarization and timestamps.

The key highlight

Ultra-low latency: Voxtral Realtime can run with latency below 200 ms, which makes it well suited to voice agents and smooth conversational experiences.
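To get a feel for what a 200 ms budget means on the client side, the sketch below computes how much audio fits in a chunk of a given duration. It assumes 16 kHz, 16-bit mono PCM and a client that streams fixed-size chunks; both are illustrative assumptions, not details of how Voxtral Realtime actually ingests audio. The general point holds regardless: if you buffer a chunk before sending it, the chunk's duration is a floor on the delay you add.

```python
def chunk_size(latency_ms: int,
               sample_rate_hz: int = 16_000,   # assumed sample rate
               bytes_per_sample: int = 2) -> tuple[int, int]:
    """Return (samples, bytes) of mono PCM audio covering latency_ms."""
    samples = sample_rate_hz * latency_ms // 1000
    return samples, samples * bytes_per_sample


def split_into_chunks(pcm: bytes, chunk_bytes: int) -> list[bytes]:
    """Split a PCM buffer into fixed-size chunks; the last one may be shorter."""
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


if __name__ == "__main__":
    samples, nbytes = chunk_size(200)
    print(f"200 ms of audio = {samples} samples = {nbytes} bytes")

    one_second_of_silence = bytes(16_000 * 2)
    chunks = split_into_chunks(one_second_of_silence, nbytes)
    print(f"1 s of audio splits into {len(chunks)} chunks")
```

Smaller chunks reduce the buffering delay but mean more, smaller messages on the wire, so the chunk duration is a trade-off rather than a value to minimize blindly.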



