Machine Learning-Driven Multimodal Spectroscopic Liquid Biopsy for Early Multicancer Detection
arXiv:2605.13218v1 Announce Type: new
Abstract: Cancer is one of the leading causes of death worldwide, making the development of rapid, minimally invasive, label-free and scalable diagnostic strategies a major challenge in modern oncology. In this context, spectroscopic liquid biopsy has emerged as a promising alternative, as it enables the holistic characterization of biochemical alterations in biological fluids. In this work, we propose a multimodal spectroscopic liquid biopsy framework for multicancer detection based on the combination of Fourier Transform Infrared (FTIR) spectroscopy, Raman spectroscopy, and Excitation-Emission Matrix (EEM) fluorescence spectroscopy together with Machine Learning (ML) methodologies. Serum samples from breast cancer patients, colorectal cancer patients, and healthy controls were analyzed through the three spectroscopic modalities. After modality-specific preprocessing, low-level data fusion (LLDF) was employed to integrate the complementary biochemical information encoded within the different spectroscopic measurements, and classification was performed using XGBoost models. Seven experimental configurations were evaluated, including the three unimodal approaches, all pairwise bimodal configurations, and the full multimodal approach of FTIR, Raman, and EEM fluorescence. The results show that although several individual modalities achieved high discrimination performance, the multimodal fusion provided the most balanced overall results, reaching a ROC-AUC of 0.997 for breast cancer and 0.994 for colorectal cancer, together with highly balanced sensitivity and specificity values.