PaddleOCR 3.5 adds Transformers backend for OCR

PaddleOCR 3.5 gives you more flexibility when turning documents into usable data. The key change: OCR and document-parsing models provided by PaddleOCR can now run with transformers as an inference backend, which makes them easier to plug into stacks centered on Hugging Face and PyTorch.

What PaddleOCR 3.5 brings

The main addition is a more flexible inference interface: the engine parameter selects the backend, and engine_config accepts backend-specific options. In practice this means:

PaddleOCR still manages the internal OCR and document-parsing pipelines, so you don’t have to call every component manually.
transformers becomes a supported backend to run compatible PaddleOCR models.
You can configure options like dtype, device placement and attention implementation through engine_config.
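
To make the engine / engine_config pairing concrete, here is a minimal sketch. The engine and engine_config parameter names and the "transformers" backend name come from the text above. The keys inside engine_config (dtype, device_map, attn_implementation) are assumptions modeled on common Hugging Face loading options, and build_engine_kwargs and make_ocr are hypothetical helpers, so check the PaddleOCR documentation for the exact names before relying on them.

```python
from typing import Any, Dict


def build_engine_kwargs(
    dtype: str = "bfloat16",
    device_map: str = "auto",
    attn_implementation: str = "sdpa",
) -> Dict[str, Any]:
    """Collect the backend choice and its options in one place.

    `engine` and `engine_config` are named in the release text; the keys
    inside `engine_config` are ASSUMED, not confirmed by the source.
    """
    return {
        "engine": "transformers",
        "engine_config": {
            "dtype": dtype,                              # assumed key
            "device_map": device_map,                    # assumed key
            "attn_implementation": attn_implementation,  # assumed key
        },
    }


def make_ocr(**overrides: Any):
    """Create a PaddleOCR pipeline on the Transformers backend.

    The constructor call shape is an assumption; the import is lazy so
    the helper above can be used without paddleocr installed.
    """
    from paddleocr import PaddleOCR

    return PaddleOCR(**build_engine_kwargs(**overrides))


if __name__ == "__main__":
    # Inspect the arguments that would be passed to the pipeline.
    print(build_engine_kwargs(dtype="float16"))
```

Keeping the backend options in one dictionary means switching back to another backend only requires changing the engine value and its config, while the rest of the pipeline code stays the same.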

A simple way to understand the stack:

Layer	What it means	Examples
Application layer	Applications that consume OCR and document parsing	RAG, agents, Document AI
Model layer	OCR and parsing capabilities	PP-OCRv5, PaddleOCR-VL 1.5
Inference backend layer	Runtime to execute the models	paddle_static, paddle_dynamic, transformers

Practical examples (installation and usage)

When to use the Transformers backend and when not?

Technical recommendations and best practices

Impact for Document AI projects
