Turbo-OCR for high-volume image and PDF processing
r/LocalLLaMA
I recently had to process ~940,000 PDFs. I started with the standard OCR tools, but they bottlenecked badly: even on an RTX 5090, throughput was low.

The Problem:

PaddleOCR (the most popular open source OCR): Maxed out at ~15 img/s, with GPU utilization hovering around 15%. Its high-performance inference mode doesn't support Blackwell GPUs yet (it needs CUDA < 12.8) and doesn't work with the Latin recognition model either.

VLM OCR (via vLLM): Great accuracy, but it crawled at 2 img/s. At a million pages, the time/cost was prohibitive.
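To put those rates in perspective, here is a rough back-of-the-envelope sketch of wall-clock time. It assumes one image per PDF (real PDFs often have many pages, so actual numbers would be worse) and that each tool sustains its quoted rate for the whole run. The function name `hours_to_process` is just for illustration.

```python
def hours_to_process(num_images: int, images_per_second: float) -> float:
    """Wall-clock hours needed at a constant throughput."""
    return num_images / images_per_second / 3600


# Assumption: one image per PDF, so ~940,000 images in total.
NUM_IMAGES = 940_000

# Throughput figures quoted in the post (images per second).
RATES = {
    "PaddleOCR": 15.0,
    "VLM OCR via vLLM": 2.0,
}

for name, rate in RATES.items():
    hours = hours_to_process(NUM_IMAGES, rate)
    print(f"{name}: {hours:.1f} h (~{hours / 24:.1f} days)")
```

Under those assumptions, PaddleOCR lands at roughly 17 hours and the VLM route at about five and a half days, before counting multi-page PDFs.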