TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction
arXiv:2604.22880v1 Announce Type: new
Abstract: Existing document OCR largely targets plain text or Markdown, discarding the structural and executable properties that make LaTeX essential for scientific publishing. We study page-level reconstruction o…