olmOCR 2: AI that improves OCR for PDF documents

olmOCR 2 arrives as a tool designed to read tricky PDFs and turn them into structured text without leaning on brittle rules. Can you imagine uploading an academic paper full of equations, tables and multiple columns and getting ready-to-use Markdown, HTML and LaTeX back? That's exactly what this release promises.

What is olmOCR 2

olmOCR 2 (olmOCR-2-7B-1025) is Ai2's new document reader that combines vision and language to transcribe complex pages in a single pass. The team presents it as an end-to-end solution that generates structure (headings in Markdown, tables in HTML, equations in LaTeX) directly in the output. (allenai.org)

The innovation: training with unit tests as a reward

The core idea is to train against properties you can verify automatically. Instead of optimizing only for a similarity metric, Ai2 turned verifiable properties of the document into unit tests that return pass or fail. Those tests act as rewards during training, so the model learns to produce outputs that pass the checks rather than outputs that merely resemble a reference.
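To make the idea concrete, here is a minimal sketch of what "unit tests as a reward" could look like. It is not Ai2's training code: the test types (`text_present`, `appears_before`, `has_html_table`), the sample strings, and the choice to score an output by the fraction of tests it passes are all assumptions made for illustration.

```python
import re
from typing import Callable, List

# A unit test takes the model's transcription and returns True (pass) or False (fail).
UnitTest = Callable[[str], bool]


def text_present(snippet: str) -> UnitTest:
    """Hypothetical test: pass if the snippet appears in the output, ignoring whitespace differences."""
    def check(output: str) -> bool:
        normalized_snippet = " ".join(snippet.split())
        normalized_output = " ".join(output.split())
        return normalized_snippet in normalized_output
    return check


def appears_before(first: str, second: str) -> UnitTest:
    """Hypothetical test: pass if both snippets are present and `first` comes earlier (reading order)."""
    def check(output: str) -> bool:
        i, j = output.find(first), output.find(second)
        return i != -1 and j != -1 and i < j
    return check


def has_html_table(min_rows: int) -> UnitTest:
    """Hypothetical test: pass if the output contains an HTML table with at least `min_rows` rows."""
    def check(output: str) -> bool:
        return "<table" in output and len(re.findall(r"<tr\b", output)) >= min_rows
    return check


def reward(output: str, tests: List[UnitTest]) -> float:
    """Assumed aggregation: the reward is the fraction of binary tests that pass."""
    if not tests:
        return 0.0
    return sum(test(output) for test in tests) / len(tests)


# Illustrative page-level tests and two candidate transcriptions (made-up data).
tests = [
    text_present("Results"),
    appears_before("Abstract", "Results"),
    has_html_table(2),
]
good = "# Abstract\nWe study OCR.\n# Results\n<table><tr><td>a</td></tr><tr><td>b</td></tr></table>"
bad = "# Results\nWe study OCR.\n# Abstract\n"

print(reward(good, tests))  # all three tests pass
print(reward(bad, tests))   # only the text-presence test passes
```

Because every test is a plain pass/fail check, the reward can be computed without a human in the loop, which is what makes this kind of signal usable at training scale.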
