cs.CL

Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization

arXiv:2510.05038v3 Announce Type: replace
Abstract: Multimodal encoders have pushed the boundaries of visual document retrieval, matching textual query tokens directly to image patches and achieving state-of-the-art performance on public benchmarks. R…