Hydra: Unifying Document Retrieval and Generation in a Single Vision-Language Model
arXiv:2603.28554v2 Announce Type: replace-cross
Abstract: Visual document understanding typically requires separate retrieval and generation models, doubling memory and system complexity. We present Hydra, a dual-head approach that provides both ColBE…