cs.CV, cs.LG

FRISM: Fine-Grained Reasoning Injection via Subspace-Level Model Merging for Vision-Language Models

arXiv:2601.21187v2 Announce Type: replace-cross
Abstract: Efficiently enhancing the reasoning capabilities of Vision-Language Models (VLMs) by merging them with Large Reasoning Models (LRMs) has emerged as a promising direction. However, existing meth…