Jonggeun Lee, Junseong Pyo, Gyuhyeon Seo, Yohan Jo

SpeakerSleuth: Can Large Audio-Language Models Judge Speaker Consistency across Multi-turn Dialogues?

April 21, 2026

arXiv:2601.04029v2
Abstract: Large Audio-Language Models (LALMs) as judges have emerged as a prominent approach for evaluating speech generation quality, yet their ability to assess speaker consistency across multi-turn dialogue…
