cs.AI, cs.CL, cs.HC, cs.MM, cs.SD, eess.AS

MIST: Multimodal Interactive Speech-based Tool-calling Conversational Assistants for Smart Homes

arXiv:2605.06897v1 Announce Type: new
Abstract: The rise of Internet of Things (IoT) devices in the physical world necessitates voice-based interfaces capable of handling complex user experiences. While modern Large Language Models (LLMs) already demo…