Modular Retrieval-Augmented Generalization for Human Action Recognition

arXiv:2605.08117v1 Announce Type: cross Abstract: Inertial Measurement Unit (IMU)-based Human Activity Recognition (HAR) aims to interpret and classify user behaviors from temporal motion signals. Recently, deep learning frameworks have advanced this task by learning discriminative spatiotemporal representations, significantly improving recognition performance. However, IMU-based HAR still faces several critical challenges, particularly limited training samples and static knowledge utilization, both of which severely hinder large-scale deployment. In this paper, we introduce MoRA, the first Retrieval-Augmented Module specifically designed for motion series. It can be flexibly integrated into any existing HAR model, enhancing recognition performance while maintaining inference efficiency. To address issues such as information redundancy in retrieval results and rigid fusion strategies, we propose an uncertainty-adaptive fusion unit within MoRA. This unit leverages prior physical knowledge from IMU signals to dynamically adjust the fusion strategy between original outputs and retrieved information, enabling more robust recognition. Extensive experiments on ten real-world datasets demonstrate that MoRA significantly improves the performance of existing IMU-based HAR models, consistently delivering stable and effective gains. The source code of MoRA is available at: https://github.com/liavonpenn/mora.
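The abstract does not give implementation details, but the core idea of an uncertainty-adaptive fusion between a base model's prediction and retrieved motion examples can be sketched as follows. This is a minimal illustration, not the paper's method: all names (`retrieve_topk`, `uncertainty_adaptive_fuse`), the use of cosine similarity over a memory bank, and the entropy-based blending weight are assumptions for the sake of a concrete example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def retrieve_topk(query_emb, bank_embs, k=5):
    """Return indices and cosine similarities of the k nearest bank entries."""
    q = query_emb / np.linalg.norm(query_emb)
    b = bank_embs / np.linalg.norm(bank_embs, axis=1, keepdims=True)
    sims = b @ q
    idx = np.argsort(-sims)[:k]
    return idx, sims[idx]

def uncertainty_adaptive_fuse(logits, bank_embs, bank_labels, query_emb,
                              n_classes, k=5):
    """Blend the base model's class distribution with a retrieval-based one,
    weighting retrieval more heavily when the model is uncertain.
    (Hypothetical fusion rule; the paper's actual unit may differ.)"""
    p_model = softmax(logits)
    # Normalized entropy in [0, 1] as a simple uncertainty score.
    entropy = -(p_model * np.log(p_model + 1e-12)).sum()
    alpha = entropy / np.log(n_classes)

    # Retrieval distribution: similarity-weighted vote over neighbor labels.
    idx, sims = retrieve_topk(query_emb, bank_embs, k)
    weights = np.maximum(sims, 0.0)
    p_ret = np.zeros(n_classes)
    for i, w in zip(idx, weights):
        p_ret[bank_labels[i]] += w
    if p_ret.sum() > 0:
        p_ret /= p_ret.sum()
    else:
        p_ret = p_model  # fall back when retrieval is uninformative

    # Uncertainty-adaptive blend: confident model -> trust its own output.
    return (1 - alpha) * p_model + alpha * p_ret
```

In this sketch the memory bank (`bank_embs`, `bank_labels`) would hold embeddings of labeled IMU windows; a confident base prediction (low entropy) keeps `alpha` small, so retrieval only dominates when the model itself is unsure.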
