Provide.ai

StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation

/ March 31, 2026

arXiv:2603.28565v1 Announce Type: new
Abstract: Vision-language-action (VLA) models have demonstrated exceptional performance in natural language-driven perception and control. However, the high computational cost of VLA models poses significant effic…

cs.AI, cs.RO

Ruka-v2: Tendon Driven Open-Source Dexterous Hand with Wrist and Abduction for Robot Learning

/ March 31, 2026

arXiv:2603.26660v2 Announce Type: replace
Abstract: Lack of accessible and dexterous robot hardware has been a significant bottleneck to achieving human-level dexterity in robots. Last year, we released Ruka, a fully open-sourced, tendon-driven humano…

cs.AI, cs.CV, cs.RO

Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions

/ March 31, 2026

arXiv:2504.11967v4 Announce Type: replace-cross
Abstract: Unmanned Aerial Vehicles (UAVs) are indispensable for infrastructure inspection, surveillance, and related tasks, yet they also introduce critical security challenges. This survey provides a wi…

cs.CV, cs.GR, cs.LG, cs.RO

SimULi: Real-Time LiDAR and Camera Simulation with Unscented Transforms

/ March 31, 2026

arXiv:2510.12901v3 Announce Type: replace-cross
Abstract: Rigorous testing of autonomous robots, such as self-driving vehicles, is essential to ensure their safety in real-world deployments. This requires building high-fidelity simulators to test scen…

cs.RO

DRIVE-Nav: Directional Reasoning, Inspection, and Verification for Efficient Open-Vocabulary Navigation

/ March 31, 2026

arXiv:2603.28691v1 Announce Type: new
Abstract: Open-Vocabulary Object Navigation (OVON) requires an embodied agent to locate a language-specified target in unknown environments. Existing zero-shot methods often reason over dense frontier points under…

cs.AI, cs.CV, cs.LG, cs.MM, cs.RO

Scaling Spatial Intelligence with Multimodal Foundation Models

/ March 31, 2026

arXiv:2511.13719v4 Announce Type: replace-cross
Abstract: Despite remarkable progress, multimodal foundation models still exhibit surprising deficiencies in spatial intelligence. In this work, we explore scaling up multimodal foundation models to cult…

cs.RO, cs.SY, eess.SY

LLM-Enabled Low-Altitude UAV Natural Language Navigation via Signal Temporal Logic Specification Translation and Repair

/ March 31, 2026

arXiv:2603.27583v1 Announce Type: new
Abstract: Natural language (NL) navigation for low-altitude unmanned aerial vehicles (UAVs) offers an intelligent and convenient solution for low-altitude aerial services by enabling an intuitive interface for non…

cs.LG, cs.RO

DADP: Domain Adaptive Diffusion Policy

/ March 31, 2026

arXiv:2602.04037v2 Announce Type: replace-cross
Abstract: Learning domain adaptive policies that can generalize to unseen transition dynamics remains a fundamental challenge in learning-based control. Substantial progress has been made through domain…

cs.AI, cs.RO

Heracles: Bridging Precise Tracking and Generative Synthesis for General Humanoid Control

/ March 31, 2026

arXiv:2603.27756v1 Announce Type: new
Abstract: Achieving general-purpose humanoid control requires a delicate balance between the precise execution of commanded motions and the flexible, anthropomorphic adaptability needed to recover from unpredictab…

cs.AI, cs.CV, cs.RO

Language-Conditioned World Modeling for Visual Navigation

/ March 31, 2026

arXiv:2603.26741v1 Announce Type: cross
Abstract: We study language-conditioned visual navigation (LCVN), in which an embodied agent is asked to follow a natural language instruction based only on an initial egocentric observation. Without access to g…