Turbo-OCR Update: Layout Model + Multilingual

Follow-up to my post from 18 days ago about the C++/CUDA OCR server. Two additions:

What's New:

  • Layout model: Added PP-StructureV3 for layout detection
  • Multilingual: No longer Latin-only. Now supports Chinese, Japanese, Korean, Cyrillic, Arabic, and Latin-script languages.

Same stack: C++, TensorRT FP16, multi-stream, gRPC/HTTP, direct PDF endpoint.

Benchmarks (Linux / RTX 5090 / CUDA 13.2):

  • Very text-heavy images: 100+ img/s
  • Sparse/low-text images: 1,000+ img/s
  • FUNSD dataset: 270 pages/s
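For a rough sense of what these rates mean in practice, here is a small back-of-the-envelope sketch. It reads "p/s" as pages per second and treats the reported numbers as sustained throughput, which is an assumption. Because the server runs multiple streams, the per-item figure is an amortized average, not the latency of a single request. The function names are made up for illustration and are not part of TurboOCR.

```python
# Throughput figures quoted in the post (Linux / RTX 5090).
# The first two are stated as lower bounds ("100+", "1,000+").
REPORTED_RATES = {
    "text_heavy_img_per_s": 100.0,
    "sparse_img_per_s": 1000.0,
    "funsd_pages_per_s": 270.0,  # assumes "p/s" means pages per second
}


def amortized_ms_per_item(rate_per_s: float) -> float:
    """Average wall-clock milliseconds per item at a given throughput."""
    if rate_per_s <= 0:
        raise ValueError("rate_per_s must be positive")
    return 1000.0 / rate_per_s


def seconds_for_batch(n_items: int, rate_per_s: float) -> float:
    """Estimated seconds to process n_items at a sustained throughput."""
    if rate_per_s <= 0:
        raise ValueError("rate_per_s must be positive")
    return n_items / rate_per_s


if __name__ == "__main__":
    for name, rate in REPORTED_RATES.items():
        print(f"{name}: {amortized_ms_per_item(rate):.2f} ms per item (amortized)")
    # Hypothetical workload: 27,000 scanned pages at the FUNSD rate.
    print(seconds_for_batch(27_000, REPORTED_RATES["funsd_pages_per_s"]), "seconds")
```

Real workloads will land somewhere between the text-heavy and sparse figures depending on how much text each image contains, so treat any estimate like this as a range, not a promise.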

Source: github.com/aiptimizer/TurboOCR

submitted by /u/Civil-Image5411