PA-RNet: Perturbation-Aware Residual Network for Robust Multimodal Time Series Forecasting
arXiv:2508.04750v2 Announce Type: replace
Abstract: In real-world applications, multimodal time-series forecasting faces a key challenge: textual information is often useful but unreliable. Auxiliary texts may contain irrelevant, ambiguous, incomplete, or structurally corrupted content, so integrating text directly tends to introduce noisy semantic signals and degrade forecasting performance. Robust multimodal forecasting therefore requires a model that can exploit useful textual context while suppressing misleading perturbations. To address this challenge, we propose PA-RNet, a perturbation-aware residual network for robust multimodal time-series forecasting. Rather than directly fusing textual and numerical representations, PA-RNet first refines multimodal features in a perturbation-aware manner, preserving stable contextual information while reducing unstable or misleading signals. The refined textual representations are then aligned with temporal dynamics, enabling more reliable forecasting under noisy multimodal conditions. Theoretically, we prove that PA-RNet is Lipschitz continuous with respect to textual embeddings and show that the proposed spectral residual correction can reduce the expected prediction error under zero-mean textual perturbations. We further conduct supplementary experiments with injected textual perturbations to examine the robustness of PA-RNet. Results across diverse domains show that PA-RNet consistently outperforms state-of-the-art baselines and maintains stable forecasting performance under both original and noise-perturbed textual conditions.
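To make the Lipschitz claim concrete, the following is a minimal sketch of what "Lipschitz continuous with respect to textual embeddings" means in practice: a bounded change in the text embedding produces a proportionally bounded change in the forecast. The model here is NOT PA-RNet; `toy_forecast`, the matrix `W`, and all dimensions are hypothetical stand-ins chosen only so that the bound can be checked numerically. For a linear text correction `W @ e`, the Frobenius norm of `W` is a valid (loose) upper bound on the Lipschitz constant.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def frobenius(W):
    """Frobenius norm of W; an upper bound on its spectral norm."""
    return math.sqrt(sum(x * x for row in W for x in row))


def toy_forecast(history, text_emb, W):
    """Hypothetical stand-in model (an assumption, not PA-RNet):
    a mean-of-history baseline plus a linear correction from the text embedding."""
    base = sum(history) / len(history)
    return [base + c for c in matvec(W, text_emb)]


def lipschitz_check(trials=1000, dim=8, horizon=4, seed=0):
    """Return (worst observed ratio, Lipschitz bound) for the toy model
    under zero-mean Gaussian perturbations of the text embedding."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(horizon)]
    bound = frobenius(W)
    history = [rng.gauss(0.0, 1.0) for _ in range(16)]
    worst = 0.0
    for _ in range(trials):
        emb = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        delta = [rng.gauss(0.0, 0.1) for _ in range(dim)]  # zero-mean noise
        emb_noisy = [a + b for a, b in zip(emb, delta)]
        clean = toy_forecast(history, emb, W)
        noisy = toy_forecast(history, emb_noisy, W)
        change = norm([a - b for a, b in zip(noisy, clean)])
        worst = max(worst, change / norm(delta))
    return worst, bound


if __name__ == "__main__":
    worst, bound = lipschitz_check()
    print(f"worst observed ratio: {worst:.4f}, bound: {bound:.4f}")
```

In this toy setting the observed ratio of forecast change to embedding change never exceeds the bound, which is the property the abstract asserts for the full model. The sketch says nothing about how PA-RNet achieves its bound or how tight that bound is.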