Zoom Consistency: A Free Confidence Signal in Multi-Step Visual Grounding Pipelines
arXiv:2604.15376v1 Announce Type: cross
Abstract: Multi-step zoom-in pipelines are widely used for GUI grounding, yet the intermediate predictions they produce are typically discarded after coordinate remapping. We observe that these intermediate outp…