PRISM-CTG: A Foundation Model for Cardiotocography Analysis with Multi-View SSL
arXiv:2605.02917v1 Announce Type: new
Abstract: Supervised deep learning models for automated CTG analysis are typically constrained by narrowly curated labelled datasets and limited patient cohorts, leaving substantial volumes of physiologically informative clinical recordings untapped. To address this limitation, we propose Physiology-aware Representation Learning via Integrated Self-supervision and Metadata for CTG (PRISM-CTG), a clinically grounded self-supervised foundation model (FM) for CTG that leverages large-scale unlabelled recordings to learn transferable domain-level representations. PRISM-CTG is pretrained using a multi-view self-supervised framework that jointly optimises three complementary pretext objectives: random-projection-guided masked signal reconstruction, clinical variable prediction, and feature classification. Each objective is associated with a dedicated task-specific token, enabling specialised representation learning, while controlled cross-attention facilitates information exchange across clinical contexts. By reframing patient metadata and domain knowledge, which are often underutilised in conventional training, as prediction targets, PRISM-CTG transforms readily available clinical information into additional supervisory signals that guide clinically meaningful representation learning. Extensive experiments across seven downstream CTG tasks in both antepartum and intrapartum domains demonstrate that PRISM-CTG consistently outperforms in-domain and SSL baselines. Notably, PRISM-CTG generalises strongly under external validation on two datasets, achieving performance comparable to studies trained on substantially larger, privately labelled datasets. To our knowledge, this is the first study to introduce a large-scale FM for CTG that learns domain-level representations.
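The three-objective pretraining described in the abstract lends itself to a compact sketch. The snippet below is a minimal, hypothetical PyTorch illustration of the setup: dedicated learnable tokens for masked reconstruction, clinical variable prediction, and feature classification attend jointly with signal patches through a shared Transformer encoder, and the three losses are combined as a weighted sum. All module names, dimensions, channel counts, and loss weights are assumptions for illustration only; the random-projection guidance and the paper's controlled cross-attention pattern are simplified here to plain self-attention.

```python
# Minimal, illustrative sketch of multi-objective SSL pretraining with
# task-specific tokens. Not the authors' implementation; all hyperparameters
# below are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewCTGPretrainer(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=4,
                 n_clinical_vars=5, n_feature_classes=8):
        super().__init__()
        # Patchify a 2-channel CTG signal (assumed FHR + UC) into token
        # embeddings; patch size 16 is an arbitrary illustrative choice.
        self.patch_embed = nn.Conv1d(2, d_model, kernel_size=16, stride=16)
        # One dedicated learnable token per pretext objective.
        self.recon_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.clinical_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.feature_token = nn.Parameter(torch.zeros(1, 1, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        # Heads for the three objectives.
        self.recon_head = nn.Linear(d_model, 16 * 2)              # masked-patch reconstruction
        self.clinical_head = nn.Linear(d_model, n_clinical_vars)  # metadata regression
        self.feature_head = nn.Linear(d_model, n_feature_classes) # domain-feature labels

    def forward(self, signal, mask):
        # signal: (B, 2, T); mask: (B, n_patches) boolean, True = masked patch.
        tokens = self.patch_embed(signal).transpose(1, 2)      # (B, n_patches, d)
        tokens = tokens.masked_fill(mask.unsqueeze(-1), 0.0)   # hide masked patches
        B = tokens.size(0)
        task_tokens = torch.cat([self.recon_token, self.clinical_token,
                                 self.feature_token], dim=1).expand(B, -1, -1)
        # Task tokens and signal tokens attend jointly; full self-attention
        # stands in for the controlled cross-attention of the paper.
        out = self.encoder(torch.cat([task_tokens, tokens], dim=1))
        recon = self.recon_head(out[:, 3:])       # per-patch reconstruction
        clinical = self.clinical_head(out[:, 1])  # from the clinical-variable token
        features = self.feature_head(out[:, 2])   # from the feature-classification token
        return recon, clinical, features

def pretrain_loss(recon, target_patches, mask, clinical_pred, clinical_target,
                  feature_logits, feature_labels, w=(1.0, 1.0, 1.0)):
    # Reconstruction loss on masked patches only, as in masked signal modelling.
    l_rec = F.mse_loss(recon[mask], target_patches[mask])
    l_clin = F.mse_loss(clinical_pred, clinical_target)  # metadata as regression targets
    l_feat = F.cross_entropy(feature_logits, feature_labels)
    return w[0] * l_rec + w[1] * l_clin + w[2] * l_feat
```

In practice, the relative loss weights and the attention mask governing which tokens may exchange information would be tuned per objective; the sketch fixes both to the simplest possible choices.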