I have a 16GB VRAM GPU and I'm looking for a reliable local OCR model.
Ideally it should stay under ~60% VRAM usage, so around 9–10GB max, because I want to keep it available on-demand rather than loading a huge model only for occasional batch jobs.
There are a lot of OCR models claiming to be "the best", but I care more about reliability and practical day-to-day use than benchmark hype.
Use cases:
- screenshots
- scanned documents / PDFs
- receipts or forms
- general image-to-text extraction
I'm looking at options like PaddleOCR, Surya, Tesseract, and maybe small vision-language models, but I'm not sure what people are actually using locally in 2026.
What would you recommend for a good balance of accuracy, VRAM usage, and reliability?