- Provide.ai - Page 90

AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators

May 12, 2026

arXiv:2605.08647v1 Announce Type: cross
Abstract: Multi-agent systems achieve state-of-the-art outcomes through peer collaboration. However, when an agent in the pipeline silently drops a constraint, the system’s final output may look correct even tho…

cs.CV

Qwen-Image-2.0 Technical Report

May 12, 2026

arXiv:2605.10730v1 Announce Type: new
Abstract: We present Qwen-Image-2.0, an omni-capable image generation foundation model that unifies high-fidelity generation and precise image editing within a single framework. Despite recent progress, existing m…

cs.AI

Towards Autonomous Railway Operations: A Semi-Hierarchical Deep Reinforcement Learning Approach to the Vehicle Rescheduling Problem

May 12, 2026

arXiv:2605.10257v1 Announce Type: new
Abstract: Managing disruptions in railway traffic management is a major challenge. Rising traffic density and infrastructure limits increase complexity, making the Vehicle Routing and Scheduling Problem (VRSP) dif…

cs.CL

Hint Tuning: Less Data Makes Better Reasoners

May 12, 2026

arXiv:2605.08665v1 Announce Type: new
Abstract: Large reasoning models achieve high accuracy through extended chain-of-thought but generate 5–8× more tokens than necessary, applying verbose reasoning uniformly regardless of problem difficulty. We prop…

cs.AI, cs.LG

E-TCAV: Formalizing Penultimate Proxies for Efficient Concept Based Interpretability

May 12, 2026

arXiv:2605.10261v1 Announce Type: new
Abstract: TCAV (Testing with Concept Activation Vectors) is an interpretability method that assesses the alignment between the internal representations of a trained neural network and human-understandable, high-le…

cs.CE, cs.CL, cs.CV, cs.LG

SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild?

May 12, 2026

arXiv:2602.03916v3 Announce Type: replace-cross
Abstract: Spatial reasoning is a fundamental aspect of human cognition, yet it remains a major challenge for contemporary vision-language models (VLMs). Prior work largely relied on synthetic or LLM-gene…

cs.CV, cs.GR, cs.MM, cs.SD

Unison: Harmonizing Motion, Speech, and Sound for Human-Centric Audio-Video Generation

May 12, 2026

arXiv:2605.08729v1 Announce Type: new
Abstract: Motion, speech, and sound effects are fundamental elements of human-centric videos, yet their heterogeneous temporal characteristics make joint generation highly challenging. Existing audio-video generat…

cs.AI

IndustryBench: Probing the Industrial Knowledge Boundaries of LLMs

May 12, 2026

arXiv:2605.10267v1 Announce Type: new
Abstract: In industrial procurement, an LLM answer is useful only if it survives a standards check: recommended material must match operating condition, every parameter must respect a regulated threshold, and no p…

cs.LG

What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching

May 12, 2026

arXiv:2605.08344v1 Announce Type: new
Abstract: Recent work has shown that flow matching models can be trained without explicit time conditioning, challenging the standard view that the interpolation time is needed to disambiguate velocity targ…

cs.CL, cs.CV, cs.RO

Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

May 12, 2026

arXiv:2604.18486v3 Announce Type: replace-cross
Abstract: Chain-of-Thought (CoT) reasoning has become a powerful driver of trajectory prediction in VLA-based autonomous driving, yet its autoregressive nature imposes a latency cost that is prohibitive …