
Mobile-R1: Towards Interactive Capability for VLM-Based Mobile Agent via Systematic Training

April 28, 2026

arXiv:2506.20332v4 Announce Type: replace
Abstract: Vision-language model-based mobile agents have gained the ability to understand complex instructions and mobile screenshots, benefiting from reinforcement learning paradigms like Group Relative Polic…

cs.CV

Graph-augmented Segmentation of Complex Shapes in Laser Powder Bed Fusion for Enhanced In Situ Inspection

April 28, 2026

arXiv:2604.24234v1 Announce Type: new
Abstract: The technological maturity of in situ inspection and monitoring methods in additive manufacturing is steadily increasing, enabling more efficient and practical qualification procedures. In this context, …

cs.AI, cs.DC, cs.IR, cs.LG

FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost

April 28, 2026

arXiv:2604.24073v1 Announce Type: cross
Abstract: Modern industrial Deep Learning Recommendation Models typically extract user preferences through the analysis of sequential interaction histories, subsequently generating predictions based on these der…

cs.AI, cs.CV

FOCUS: Fused Observation of Channels for Unveiling Spectra

April 28, 2026

arXiv:2507.14787v2 Announce Type: replace-cross
Abstract: Hyperspectral imaging (HSI) captures hundreds of narrow, contiguous wavelength bands, making it a powerful tool in biology, agriculture, and environmental monitoring. However, interpreting Visi…

cs.AI, cs.CL, cs.LG

Quantifying and Improving the Robustness of Retrieval-Augmented Language Models Against Spurious Features in Grounding Data

April 28, 2026

arXiv:2503.05587v3 Announce Type: replace-cross
Abstract: Robustness has become a critical attribute for the deployment of RAG systems in real-world applications. Existing research focuses on robustness to explicit noise (e.g., document semantics) but…

cs.AI

Vibe Medicine: Redefining Biomedical Research Through Human-AI Co-Work

April 28, 2026

arXiv:2604.23674v1 Announce Type: new
Abstract: With the emergence of large language models (LLMs) and AI agent frameworks, the human-AI co-work paradigm known as Vibe Coding is changing how people code, making it more accessible and productive. In sc…

cs.CV

Touchless Intraoperative Image Access System Based on Vision-Based Hand Tracking

April 28, 2026

arXiv:2604.24235v1 Announce Type: new
Abstract: Touchless interaction with medical images is becoming increasingly important in the surgical field, where sterility and continuity of the operational workflow are essential requirements. This work presen…

cs.AI, cs.CL, cs.CV

Agri-CPJ: A Training-Free Explainable Framework for Agricultural Pest Diagnosis Using Caption-Prompt-Judge and LLM-as-a-Judge

April 28, 2026

arXiv:2604.23701v1 Announce Type: cross
Abstract: Crop disease diagnosis from field photographs faces two recurring problems: models that score well on benchmarks frequently hallucinate species names, and when predictions are correct, the reasoning be…

cs.RO

QuietWalk: Physics-Informed Reinforcement Learning for Ground Reaction Force-Aware Humanoid Locomotion Under Diverse Footwear

April 28, 2026

arXiv:2604.23702v1 Announce Type: new
Abstract: Humanoid robots operating in human-centered environments (e.g., homes, hospitals, and offices) must mitigate foot–ground impact transients, as impact-induced vibration and noise degrade user experience …

cs.CV

Instance Awareness of Multi-class Semantic Segmentation Loss Functions

April 28, 2026

arXiv:2604.24276v1 Announce Type: new
Abstract: Instance-sensitive losses for semantic segmentation such as blob loss and CC loss were designed to address instance imbalance, ensuring small lesions generate the same gradient as large ones, but operate…