cs.CL

MNAFT: modality neuron-aware fine-tuning of multimodal large language models for image translation

arXiv:2604.16943v1
Abstract: Multimodal large language models (MLLMs) have shown impressive capabilities, yet they often struggle to capture the fine-grained textual information in images that is crucial for accurate image t…