cs.AI, cs.CL, cs.LG, cs.MA

Doctorina MedBench: End-to-End Evaluation of Agent-Based Medical AI

arXiv:2603.25821v1 Announce Type: new
Abstract: We present Doctorina MedBench, a comprehensive evaluation framework for agent-based medical AI, built on simulated realistic physician-patient interactions. Unlike traditional medical benchmarks t…