cs.CV

Enhancing Fine-Grained Spatial Grounding in 3D CT Report Generation via Discriminative Guidance

arXiv:2604.10437v1 Announce Type: new
Abstract: Vision–language models (VLMs) for radiology report generation (RRG) can produce long-form chest CT reports from volumetric scans and show strong potential to improve radiology workflow efficiency and co…