HealthBench Professional: Evaluating Large Language Models on Real Clinician Chats
arXiv:2604.27470v1 Announce Type: new
Abstract: Millions of clinicians use ChatGPT to support clinical care, but evaluations of the most common use cases in model-clinician conversations are limited. We introduce HealthBench Professional, an open benc…