Rethinking Pulmonary Embolism Segmentation: A Study of Current Approaches and Challenges with an Open Weight Model
arXiv:2509.18308v3
Abstract: Pulmonary Embolism (PE) is a life-threatening condition for which accurate and timely detection is critical to patient care. However, our systematic study of PE segmentation algorithms reveals concerning limitations in the current state of research. Challenges such as small and inconsistent datasets, a lack of reproducible baselines, and limited comparative evaluation across models are hindering progress in the field. In this study, we curated a densely annotated dataset comprising 490 CTPA scans, each from a unique patient (430 for training and 60 for testing). We evaluated nine widely used segmentation architectures, including both CNN- and ViT-based models, in 2D and 3D configurations, using mean Dice Similarity Coefficient (mDSC) and Average Symmetric Surface Distance (ASSD) as evaluation metrics. Furthermore, the highest-performing model was evaluated on a public dataset without fine-tuning and achieved reasonable generalization performance. Our results show that: (1) a 3D U-Net with ResNet encoding blocks remains a highly effective architecture for PE segmentation; (2) 3D models consistently outperform their 2D counterparts; (3) across all architectures, when trained and evaluated on the same datasets, model error patterns are highly consistent; and (4) distal emboli remain particularly challenging due to both task complexity and the scarcity of high-quality datasets, highlighting the need for datasets with more comprehensive and consistent distal PE coverage. To promote research reproducibility, the architecture and pretrained weights of our best-performing model are publicly available at https://github.com/mazurowski-lab/PulmonaryEmbolismSegmentation
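For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how a Dice Similarity Coefficient and an Average Symmetric Surface Distance can be computed for binary 3D masks. It is not taken from the released repository. The function names, the face-neighbor definition of a surface voxel, the handling of empty masks, and the per-case averaging used for the mean Dice are all assumptions made for illustration, and the authors' evaluation code may differ on each point.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom


def mean_dice(preds, targets):
    """Mean Dice over cases (assumption: mDSC is a simple per-case average)."""
    return float(np.mean([dice_coefficient(p, t) for p, t in zip(preds, targets)]))


def surface_voxels(mask):
    """Indices of foreground voxels that have at least one background face-neighbor."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if both neighbors along this axis are foreground.
            interior &= np.roll(padded, shift, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    return np.argwhere(mask & ~interior[core])


def assd(pred, target, spacing=None):
    """Average Symmetric Surface Distance, brute force (small volumes only)."""
    pred_pts = surface_voxels(pred).astype(float)
    target_pts = surface_voxels(target).astype(float)
    if len(pred_pts) == 0 or len(target_pts) == 0:
        return float("nan")  # assumed convention: undefined when either mask is empty
    if spacing is not None:
        # Convert voxel indices to physical units (e.g. millimetres).
        scale = np.asarray(spacing, dtype=float)
        pred_pts = pred_pts * scale
        target_pts = target_pts * scale
    diff = pred_pts[:, None, :] - target_pts[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))
    pred_to_target = dists.min(axis=1)  # nearest target surface point for each pred point
    target_to_pred = dists.min(axis=0)  # nearest pred surface point for each target point
    total = pred_to_target.sum() + target_to_pred.sum()
    return float(total / (len(pred_pts) + len(target_pts)))


if __name__ == "__main__":
    # Two toy 3x3x3 cubes offset by one voxel along the first axis.
    pred = np.zeros((8, 8, 8), dtype=bool)
    target = np.zeros((8, 8, 8), dtype=bool)
    pred[2:5, 2:5, 2:5] = True
    target[3:6, 2:5, 2:5] = True
    print("Dice:", dice_coefficient(pred, target))
    print("ASSD:", assd(pred, target, spacing=(1.0, 1.0, 1.0)))
```

The pairwise distance matrix grows with the product of the two surface sizes, so this form is only practical for small masks. On full CTPA volumes, a distance-transform-based approach such as `scipy.ndimage.distance_transform_edt` is a common alternative.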