VLN-Cache: Enabling Token Caching for VLN Models with Visual/Semantic Dynamics Awareness
arXiv:2603.07080v3 Announce Type: replace
Abstract: Vision-and-Language Navigation (VLN) increasingly relies on large vision-language models, but their inference cost conflicts with real-time deployment. Token caching is a promising training-free stra…