cs.CL, cs.CV

GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts

arXiv:2604.12978v1 Announce Type: cross
Abstract: Optical character recognition (OCR) has advanced rapidly with the rise of vision-language models, yet evaluation has remained concentrated on a small cluster of high- and mid-resource scripts. We intro…