NAPS: Attention-Based Fusion of Heterogeneous Physiological Signals
arXiv:2511.03488v2 Announce Type: replace
Abstract: Physiological signals are inherently heterogeneous: they are collected under diverse acquisition setups, differ in the number and type of modalities and channels, and vary in quality, reliability, and relevance across tasks. This variability poses a major challenge for machine learning models that must generalize across subjects, sensors, and clinical environments. Existing approaches typically train on limited modalities or single channels, yielding marginal representations that, on their own, fail to capture the systemic complexity of the physiological state. Naive fusion of these representations, for example via pooling or voting schemes, is typically suboptimal because it cannot adaptively weight different sources or capture temporal, spatial, and cross-modality dependencies. We introduce NAPS (Neural Aggregator of Physiological Signals), a neural module that performs principled data fusion to derive unified physiological representations, employing an ad hoc tri-axial attention mechanism and dimension-adaptive training to robustly handle varying high-dimensional sensor configurations. We test NAPS on automatic sleep staging from polysomnography (PSG), an ideal real-world application in which recordings consist of multiple physiological signals (EEG, EOG, EMG, ...) whose configuration varies considerably across datasets and institutions. Leveraging frozen pretrained unimodal encoders, NAPS dynamically integrates their representations or predictions, achieving state-of-the-art generalization across multiple datasets.
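
To make the contrast between naive pooling and adaptive weighting concrete, the following is a minimal NumPy sketch, not the NAPS architecture. It assumes each available channel has already been mapped to a fixed-size embedding by a frozen encoder, uses a boolean mask to mark channels missing from a recording, and attends along the channel axis only, whereas the abstract describes attention over three axes. The names `mean_pool`, `attention_pool`, `mask`, and `query` are hypothetical and chosen for illustration; in a trained model the query would be learned rather than random.

```python
import numpy as np

def mean_pool(embeddings, mask):
    """Naive fusion: unweighted average over the channels that are present."""
    m = mask[:, None].astype(embeddings.dtype)
    return (embeddings * m).sum(axis=0) / m.sum()

def attention_pool(embeddings, mask, query):
    """Adaptive fusion: softmax-weighted average over present channels.

    Weights come from scaled dot products with a query vector
    (random here; an assumption standing in for a learned parameter).
    """
    d = embeddings.shape[1]
    scores = embeddings @ query / np.sqrt(d)
    # Missing channels get -inf so their softmax weight is exactly zero.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ embeddings, weights

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 8))         # 4 channel slots, 8-dim embeddings
mask = np.array([True, True, False, True])   # third channel absent in this recording
query = rng.normal(size=8)

fused_mean = mean_pool(embeddings, mask)
fused_attn, weights = attention_pool(embeddings, mask, query)
```

Both functions return a single fused vector regardless of how many channels are present, but only the attention variant can assign different weights to different sources; masking is one simple way to tolerate sensor configurations that change between recordings.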