PSF-Med: Measuring and Explaining Paraphrase Sensitivity in Medical Vision Language Models
arXiv:2602.21428v2
Abstract: Medical Vision Language Models (VLMs) can change their answers when clinicians rephrase the same question, a failure mode that threatens deployment safety. We introduce PSF-Med, a benchmark of 26,850…