Nicholas Sadjoli, Tim Siefken, Atin Ghosh, Yifan Mai, Daniel Dahlmeier

Optimization before Evaluation: Evaluation with Unoptimised Prompts Can be Misleading

Nicholas Sadjoli, Tim Siefken, Atin Ghosh, Yifan Mai, Daniel Dahlmeier / May 1, 2026

arXiv:2604.27637v1
Abstract: Current Large Language Model (LLM) evaluation frameworks utilize the same static prompt template across all models under evaluation. This differs from the common industry practice of using prompt optimiz…
