LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation
arXiv:2604.00829v3
Abstract: Adapting pretrained language models (LMs) into vision-language models (VLMs) can degrade their native linguistic capability due to representation shift and cross-modal interference introduced d…