ParseBench: The First Document Parsing Benchmark for AI Agents

We just released ParseBench, a benchmark designed to evaluate how well document parsers and OCR systems actually work when feeding data into AI agents.

There are a ton of OCR and parsing benchmarks out there, but for us, none of them captured the issues and requirements our customers were reporting. Most datasets cover only simple documents or have limited evaluation rules.

ParseBench is an open-source benchmark of ~2,000 human-verified enterprise document pages with 167,000+ test rules across five key dimensions: tables, charts, content faithfulness, semantic formatting, and visual grounding.
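To make that structure concrete, here is a minimal sketch of what a per-page annotation with test rules might look like. This is a hypothetical schema for illustration only; field names like `dimension`, `rule_id`, and `expected` are assumptions, not ParseBench's actual format.

```python
# Hypothetical sketch of a ParseBench-style annotation record.
# The field names and structure are assumptions for illustration,
# not the benchmark's actual schema.
from dataclasses import dataclass, field

@dataclass
class TestRule:
    rule_id: str
    dimension: str   # one of the five dimensions: tables, charts, content
                     # faithfulness, semantic formatting, visual grounding
    expected: str    # ground-truth value the parser output is checked against

@dataclass
class PageAnnotation:
    doc_id: str
    page_number: int
    industry: str    # e.g. "finance", "legal" (hypothetical labels)
    rules: list[TestRule] = field(default_factory=list)

# Example: one human-verified page carrying a couple of test rules.
page = PageAnnotation(
    doc_id="sample-earnings-report",
    page_number=4,
    industry="finance",
    rules=[
        TestRule("tbl-001", "tables", "Q3 revenue: $4.2M"),
        TestRule("grd-014", "visual grounding", "bbox=[120,340,480,410]"),
    ],
)
```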

The dataset is built from real-world documents across multiple industries/formats with ground-truth annotations.

All of the data is completely open source, and so is the eval framework, so anyone can run any parsing/OCR system against the benchmark.
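As a rough sketch of what running a parser against a benchmark like this could look like, continuing the hypothetical `TestRule`/`PageAnnotation` types above. The function names and the simple containment check are placeholders, not the actual ParseBench API; real rules would involve richer checks (table structure, bounding-box comparison, etc.).

```python
# Hypothetical evaluation loop; parse_fn and the rule check below
# are placeholders, not ParseBench's actual API.
def check_rule(parsed_text: str, rule: TestRule) -> bool:
    # Simplest possible check: does the parser output contain the
    # expected ground-truth string? A real framework would apply
    # dimension-specific logic instead.
    return rule.expected in parsed_text

def evaluate(pages: list[PageAnnotation], parse_fn) -> float:
    passed = total = 0
    for page in pages:
        # parse_fn is any parsing/OCR system under test; here it is
        # assumed to return the parsed text for one document page.
        parsed_text = parse_fn(page.doc_id, page.page_number)
        for rule in page.rules:
            total += 1
            passed += check_rule(parsed_text, rule)
    return passed / total  # fraction of test rules satisfied

# Usage (hypothetical): score = evaluate(benchmark_pages, my_ocr_system.parse)
```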

A few links:

submitted by /u/grilledCheeseFish
