Authors: Sebastian Nagl, Matthias Grabmair

BenGER: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks

Sebastian Nagl, Matthias Grabmair / April 23, 2026

arXiv:2604.13583v2 Announce Type: replace-cross
Abstract: Evaluating large language models (LLMs) for legal reasoning requires workflows that span task design, expert annotation, model execution, and metric-based evaluation. In practice, these steps are split…

cs.AI, cs.CL

BenGER: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks

Sebastian Nagl, Matthias Grabmair / April 16, 2026

arXiv:2604.13583v1 Announce Type: cross
Abstract: Evaluating large language models (LLMs) for legal reasoning requires workflows that span task design, expert annotation, model execution, and metric-based evaluation. In practice, these steps are split…