COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition
arXiv:2503.07259v2 Announce Type: replace-cross
Abstract: Creating intelligent, human-centered wearable systems for continuous activity understanding involves a fundamental trade-off: egocentric video-based models capture rich semantic inform…