Omni-o3: Deep Nested Omnimodal Deduction for Deliberative Audio-Visual Reasoning
arXiv:2604.24191v1 Announce Type: new
Abstract: Omnimodal understanding entails a massive, highly redundant search space of cross-modal interactions, demanding focused and deliberative reasoning. Current reasoning paradigms rely on either sequential s…