
Towards Reliable Human Evaluations in Gesture Generation: Insights from a Community-Driven State-of-the-Art Benchmark

/ April 22, 2026

arXiv:2511.01233v3 Announce Type: replace
Abstract: We review human evaluation practices in automatic, speech-driven 3D gesture generation and find a lack of standardisation and frequent use of flawed experimental setups. This leads to a situation whe…

cs.AI, cs.CL

CulturALL: Benchmarking Multilingual and Multicultural Competence of LLMs on Grounded Tasks

/ April 22, 2026

arXiv:2604.19262v1 Announce Type: cross
Abstract: Large language models (LLMs) are now deployed worldwide, inspiring a surge of benchmarks that measure their multilingual and multicultural abilities. However, these benchmarks prioritize generic langua…

cs.AI, cs.LG

Diamond Maps: Efficient Reward Alignment via Stochastic Flow Maps

/ April 22, 2026

arXiv:2602.05993v2 Announce Type: replace
Abstract: Flow and diffusion models produce high-quality samples, but adapting them to user preferences or constraints post-training remains costly and brittle, a challenge commonly called reward alignment. We…

cs.AI

GRAIL: Learning to Interact with Large Knowledge Graphs for Retrieval Augmented Reasoning

/ April 22, 2026

arXiv:2508.05498v2 Announce Type: replace
Abstract: Large Language Models (LLMs) integrated with Retrieval-Augmented Generation (RAG) techniques have exhibited remarkable performance across a wide range of domains. However, existing RAG approaches pri…

cs.CL

Can Continual Pre-training Bridge the Performance Gap between General-purpose and Specialized Language Models in the Medical Domain?

/ April 22, 2026

arXiv:2604.19394v1 Announce Type: new
Abstract: This paper narrows the performance gap between small, specialized models and significantly larger general-purpose models through domain adaptation via continual pre-training and merging. We address the s…

cs.RO

GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation

/ April 22, 2026

arXiv:2604.19522v1 Announce Type: new
Abstract: Bimanual mobile manipulation requires a seamless integration between high-level semantic reasoning and safe, compliant physical interaction – a challenge that end-to-end models approach opaquely and clas…

cs.AI, cs.CL

Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs

/ April 22, 2026

arXiv:2604.19292v1 Announce Type: cross
Abstract: Multilingual large language models (LLMs) have minimized the fluency gap between languages. This advancement, however, exposes models to the risk of biased behavior, as knowledge and norms may propagat…

cs.CV

BALTIC: A Benchmark and Cross-Domain Strategy for 3D Reconstruction Across Air and Underwater Domains Under Varying Illumination

/ April 22, 2026

arXiv:2604.19133v1 Announce Type: new
Abstract: Robust 3D reconstruction across varying environmental conditions remains a critical challenge for robotic perception, particularly when transitioning between air and water. To address this, we introduce …

cs.RO

LiveVLN: Breaking the Stop-and-Go Loop in Vision-Language Navigation

/ April 22, 2026

arXiv:2604.19536v1 Announce Type: new
Abstract: Recent navigation systems achieve strong benchmark results, yet real-world deployment often remains visibly stop-and-go. This bottleneck arises because the sense-inference-execution loop is still blockin…

cs.CL

Lost in Translation: Do LVLM Judges Generalize Across Languages?

/ April 22, 2026

arXiv:2604.19405v1 Announce Type: new
Abstract: Automatic evaluators such as reward models play a central role in the alignment and evaluation of large vision-language models (LVLMs). Despite their growing importance, these evaluators are almost exclu…