Rethinking Information Synthesis in Multimodal Question Answering: A Multi-Agent Perspective
arXiv:2505.20816v2 Announce Type: replace
Abstract: Recent advances in multimodal question answering have primarily focused on combining heterogeneous modalities or fine-tuning multimodal large language models. While these approaches have shown strong…