Count Anything at Any Granularity
arXiv:2605.10887v1 Announce Type: new
Abstract: Open-world object counting remains brittle: despite rapid advances in vision-language models (VLMs), reliably counting the objects a user intends is far from solved. We argue that a central reason is tha…