OMHBench: Benchmarking Balanced and Grounded Omni-Modal Multi-Hop Reasoning

April 29, 2026

arXiv:2508.16198v3 Announce Type: replace
Abstract: Multimodal Large Language Models (MLLMs) have increasingly supported omni-modal processing across text, vision, and speech. However, existing evaluation frameworks for such models suffer from critica…

cs.RO, eess.AS

ASAP: An Azimuth-Priority Strip-Based Search Approach to Planar Microphone Array DOA Estimation in 3D

April 29, 2026

arXiv:2604.25387v1 Announce Type: cross
Abstract: Direction-of-arrival (DOA) estimation is an important task in microphone array processing and many downstream applications. The steered response power with phase transform (SRP-PHAT) method has been wi…

cs.AI, cs.CL

Analyzing LLM Reasoning to Uncover Mental Health Stigma

April 29, 2026

arXiv:2604.25053v1 Announce Type: new
Abstract: While large language models (LLMs) are increasingly being explored for mental health applications, recent studies reveal that they can exhibit stigma toward individuals with psychological conditions. Exi…

cs.CL, cs.CV

Toward Multimodal Conversational AI for Age-Related Macular Degeneration

April 29, 2026

arXiv:2604.25720v1 Announce Type: cross
Abstract: Despite strong performance of deep learning models in retinal disease detection, most systems produce static predictions without clinical reasoning or interactive explanation. Recent advances in multim…

cs.LG, cs.SE

FGDM: Reasoning Aware Multi-Agentic Framework for Software Bug Detection using Chain of Thought and Tree of Thought Prompting

April 29, 2026

arXiv:2604.24831v1 Announce Type: cross
Abstract: Deep Learning methods are becoming prominent in automated software bug detection; however, they lack the global understanding of the given code. Consequently, their performance tends to degrade, especi…

cs.AI, cs.CV, cs.RO

From Scene to Object: Text-Guided Dual-Gaze Prediction

April 29, 2026

arXiv:2604.20191v2 Announce Type: replace-cross
Abstract: Interpretable driver attention prediction is crucial for human-like autonomous driving. However, existing datasets provide only scene-level global gaze rather than fine-grained object-level ann…

cs.CL, cs.HC

The Dynamics of Delusion: Modeling Bidirectional False Belief Amplification in Human-Chatbot Dialogue

April 29, 2026

arXiv:2604.25096v1 Announce Type: new
Abstract: There is growing concern that AI chatbots might fuel delusional beliefs in users. Some have suggested that humans and chatbots mutually reinforce false beliefs over time, but quantitative evidence is lac…

cs.AI, cs.GR, cs.LG, cs.RO

MotionBricks: Scalable Real-Time Motions with Modular Latent Generative Model and Smart Primitives

April 29, 2026

arXiv:2604.24833v1 Announce Type: cross
Abstract: Despite transformative advances in generative motion synthesis, real-time interactive motion control remains dominated by traditional techniques. In this work, we identify two key challenges in bridgin…

cs.CL

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

April 29, 2026

arXiv:2602.15547v2 Announce Type: replace
Abstract: Text embedding models are widely used for semantic similarity tasks, including information retrieval, clustering, and classification. General-purpose models are typically trained with single- or mult…

cs.AI, cs.LG

Time-varying Interaction Graph ODE for Dynamic Graph Representation Learning

April 29, 2026

arXiv:2604.24811v1 Announce Type: new
Abstract: Graph neural Ordinary Differential Equations (ODE) combine neural ODE with the message passing mechanism of Graph Neural Networks (GNN), providing a continuous-time modeling method for graph representati…