Voice is the most natural interface we have. Why are most robust speech recognition systems still behind closed doors? Ai2 has just released OLMoASR, a family of fully open ASR models that aims to change that by offering weights, data and reproducible code to the whole community. (allenai.org)
What is OLMoASR
OLMoASR is a series of automatic speech recognition (ASR) models trained from scratch on a large, curated dataset. The central idea is to show that, with well-filtered data, an open model can match or come close to the performance of widely used proprietary systems such as Whisper. (allenai.org)
Initial models released:
- OLMoASR-tiny.en (39M parameters)
- OLMoASR-base.en (74M parameters)
- OLMoASR-small.en (244M parameters)
- OLMoASR-medium.en (769M parameters)
- OLMoASR-large.en-v1 (1.5B parameters, trained on 440,000 hours per epoch)
- OLMoASR-large.en-v2 (1.5B parameters, trained on 680,000 hours per epoch)
These models were evaluated across 21 diverse test sets, including audiobooks, phone calls, meetings and lectures, to measure robustness in real-world conditions. (allenai.org)
Why it matters (and what they did differently)
OLMoASR's bet isn't just size but transparency and data quality. Ai2 compiled OLMoASR-Pool, a collection of roughly 3 million hours of audio with 17 million transcriptions, and rigorously filtered it down to OLMoASR-Mix, a curated 1 million-hour set. That process includes language-audio alignment, removal of noisy automatic transcripts and fuzzy deduplication. The entire pipeline is public, so you can reproduce or improve each step. (allenai.org)
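To make the filtering idea concrete, here is a minimal sketch of what fuzzy deduplication of transcripts can look like. This is an illustration only, not Ai2's pipeline code: the normalization, the 0.9 similarity threshold and the brute-force pairwise comparison are all simplifying assumptions.

```python
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial variants compare equal.
    return " ".join(text.lower().split())

def is_near_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    # A ratio near 1.0 means near-identical transcripts; the 0.9 cutoff
    # is an illustrative choice, not Ai2's actual setting.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

def fuzzy_dedupe(transcripts: list[str]) -> list[str]:
    # O(n^2) pairwise comparison: fine for a demo, far too slow at the
    # million-hour scale, where pipelines typically hash (e.g. MinHash) first.
    kept: list[str] = []
    for t in transcripts:
        if not any(is_near_duplicate(t, k) for k in kept):
            kept.append(t)
    return kept

print(fuzzy_dedupe([
    "Hello world, welcome to the show.",
    "hello world  welcome to the show",  # near-duplicate, dropped
    "A completely different sentence.",
]))
```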
The practical lesson: clean, well-curated data can matter as much as, or more than, blindly scaling up parameters.
Key results (in simple terms)
Ai2 reports that OLMoASR matches or outperforms Whisper's zero-shot performance at most model scales. To put some numbers on it:
- OLMoASR-medium.en reaches 12.8% WER on short-form audio and 11.0% on long-form, versus 12.4% and 10.5% for Whisper-medium.en.
- OLMoASR-large.en-v1, trained on 440K hours per epoch, achieves 13.0% WER on short-form versus 12.2% for Whisper-large-v1 (trained on 680K multilingual hours). Retraining at 680K hours per epoch (the v2 model) narrows the gap to about 0.4% WER.
If you're wondering what WER is: word error rate, the standard metric for comparing transcripts against a reference. Lower values are better. (allenai.org)
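Since every comparison above hinges on WER, here is a small, self-contained sketch of how it is computed as a word-level edit distance. This follows the standard definition; it is not code from Ai2's evaluation harness.

```python
def wer(reference: str, hypothesis: str) -> float:
    # WER = (substitutions + insertions + deletions) / reference word count,
    # computed via Levenshtein distance over words.
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution ("cat" -> "hat") over six reference words: WER = 1/6 ~= 16.7%.
print(wer("the cat sat on the mat", "the hat sat on the mat"))
```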
How it can serve you today
- If you're a researcher, you now have an open testbed to study how data quality affects generalization. (allenai.org)
- If you're a developer or startup, you can experiment with small models and scale up without depending on proprietary services.
- For accessibility and transcription projects in cultural or educational institutions, open weights and data make auditing and specific adaptations easier.
Try OLMoASR in the Ai2 Playground and download models and data from Hugging Face or GitHub to integrate into your workflow. (allenai.org)
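As a starting point, something like the sketch below should work if the checkpoints are published in a transformers-compatible (Whisper-style) format. The model identifier is a placeholder assumption; check the OLMoASR pages on Hugging Face or GitHub for the real model IDs and the recommended loading path.

```python
# Hypothetical usage sketch: assumes an OLMoASR checkpoint that loads through
# the Hugging Face transformers ASR pipeline. The model ID below is a
# placeholder, not a confirmed repository name.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="allenai/OLMoASR-medium.en",  # placeholder model ID
)

result = asr("meeting_recording.wav")  # path to a local audio file
print(result["text"])
```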
Limitations and open questions
Not everything is solved. These models are essentially English-focused (note the .en suffix in their names), and their metrics come from benchmarks that, while diverse, don't cover every accent, dialect or real-world condition. Also, even though the data is public and curated, sourcing large web collections always raises questions about bias, privacy and licensing that you should review before deploying to production. (allenai.org)
Final reflection
OLMoASR shows that an open alternative can compete with closed systems when scale is combined with careful data curation. Does that mean the future of ASR will be fully open overnight? Not necessarily, but it's a concrete step toward letting researchers and developers collaborate on and audit voice systems with more transparency. If you work with voice, you now have new tools to try and to improve in the open. (allenai.org)