Evaluating Small Open LLMs for Medical Question Answering: A Practical Framework
arXiv:2604.10535v1 Announce Type: cross
Abstract: Incorporating large language models (LLMs) into medical question answering demands more than high average accuracy: a model that returns substantively different answers each time it is queried is not a r…