XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations

May 15, 2026

arXiv:2511.02776v2 Announce Type: replace
Abstract: Recent progress in large-scale robotic datasets and vision-language models (VLMs) has advanced research on vision-language-action (VLA) models. However, existing VLA models still face two fundamental…

cs.LG, cs.RO, cs.SY, eess.SY

CoCo-InEKF: State Estimation with Learned Contact Covariances in Dynamic, Contact-Rich Scenarios

May 15, 2026

arXiv:2605.15122v1 Announce Type: new
Abstract: Robust state estimation for highly dynamic motion of legged robots remains challenging, especially in dynamic, contact-rich scenarios. Traditional approaches often rely on binary contact states that fail…

cs.AI, cs.LG, cs.RO

Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons

May 15, 2026

arXiv:2603.02115v2 Announce Type: replace-cross
Abstract: General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations, providing only local, frame-level supervision. While effective for expert…

cs.AI, cs.RO

Pelican-Unified 1.0: A Unified Embodied Intelligence Model for Understanding, Reasoning, Imagination and Action

May 15, 2026

arXiv:2605.15153v1 Announce Type: cross
Abstract: We present Pelican-Unified 1.0, the first embodied foundation model trained according to the principle of unification. Pelican-Unified 1.0 uses a single VLM as a unified understanding module, mapping s…

cs.CV, cs.RO

Driving Intents Amplify Planning-Oriented Reinforcement Learning

May 15, 2026

arXiv:2605.12625v2 Announce Type: replace
Abstract: Continuous-action policies trained on a single demonstrated trajectory per scene suffer from mode collapse: samples cluster around the demonstrated maneuver and the policy cannot represent semantical…

cs.CR, cs.CV, cs.LG, cs.RO

Systematic Discovery of Semantic Attacks in Online Map Construction through Conditional Diffusion

May 15, 2026

arXiv:2605.14396v1 Announce Type: cross
Abstract: Autonomous vehicles depend on online HD map construction to perceive lane boundaries, dividers, and pedestrian crossings — safety-critical road elements that directly govern motion planning. While exi…

cs.CV, cs.RO

Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers

May 15, 2026

arXiv:2511.14751v2 Announce Type: replace-cross
Abstract: We propose Confidence-Guided Token Merging (Co-Me), an acceleration mechanism for visual geometric transformers without retraining or finetuning the base model. Co-Me distills a light-weight c…

cs.DC, cs.GR, cs.NA, cs.RO, math.NA

DiffPhD: A Unified Differentiable Solver for Projective Heterogeneous Materials in Elastodynamics with Contact-Rich GPU-Acceleration

May 15, 2026

arXiv:2605.14526v1 Announce Type: cross
Abstract: Differentiable simulation of soft bodies is a foundation for system identification, trajectory optimization, and Real2Sim transfer. Yet, existing methods such as the differentiable Projective Dynamics …

cs.AI, cs.RO

D-VLA: A High-Concurrency Distributed Asynchronous Reinforcement Learning Framework for Vision-Language-Action Models

May 15, 2026

arXiv:2605.13276v2 Announce Type: replace
Abstract: The rapid evolution of Embodied AI has enabled Vision-Language-Action (VLA) models to excel in multimodal perception and task execution. However, applying Reinforcement Learning (RL) to these massive…

cs.CL, cs.LG, stat.ML

Knowing When to Quit: A Principled Framework for Dynamic Abstention in LLM Reasoning

May 15, 2026

arXiv:2604.18419v3 Announce Type: replace-cross
Abstract: LLMs utilizing chain-of-thought reasoning often waste substantial compute by producing long, incorrect responses. Abstention can mitigate this by withholding outputs unlikely to be correct. Whi…