ViDR: Grounding Multimodal Deep Research Reports in Source Visual Evidence
arXiv:2605.13034v1 Announce Type: new
Abstract: Recent deep research systems have improved the ability of large language models to produce long, grounded reports through iterative retrieval and reasoning. However, most text-centered systems rely mainl…