A Human-Inspired Decoupled Architecture for Efficient Audio Representation Learning
arXiv:2603.26098v1 Announce Type: cross
Abstract: While self-supervised learning (SSL) has revolutionized audio representation, the excessive parameterization and quadratic computational cost of standard Transformers limit their deployment on resource…