Watermarking LLM Agent Trajectories

/ May 5, 2026

arXiv:2602.18700v2 Announce Type: replace-cross
Abstract: LLM agents rely heavily on high-quality trajectory data to guide their problem-solving behaviors, yet producing such data requires substantial task design, high-capacity model generation, and m…

cs.AI, cs.CL, cs.LG

VeRO: An Evaluation Harness for Agents to Optimize Agents

/ May 5, 2026

arXiv:2602.22480v2 Announce Type: replace
Abstract: An important emerging application of coding agents is agent optimization: the iterative improvement of a target agent through edit-execute-evaluate cycles. Despite its relevance, the community lacks …

cs.AI, cs.CV

Sparse Representation Learning for Vessels

/ May 5, 2026

arXiv:2605.01382v1 Announce Type: cross
Abstract: Analyzing human vasculature and vessel-like, tubular structures, such as airways, is crucial for disease diagnosis and treatment. Current methods often rely on small sub-regions or simplified tree-like…

cs.CV

VISTA: Video Interaction Spatio-Temporal Analysis Benchmark

/ May 5, 2026

arXiv:2605.01391v1 Announce Type: new
Abstract: Existing benchmarks for Vision-Language Models (VLMs) primarily evaluate spatio-temporal understanding on simple single-action videos, closed attribute sets and restricted entity types, failing to captur…

cs.CL, cs.IR

Led to Mislead: Adversarial Content Injection for Attacks on Neural Ranking Models

/ May 5, 2026

arXiv:2605.01591v1 Announce Type: cross
Abstract: Neural Ranking Models (NRMs) are central to modern information retrieval but remain highly vulnerable to adversarial manipulation. Existing attacks often rely on heuristics or surrogate models, limitin…

cs.CV

Act in Collusion: Distributed Multi-Target Backdoor Attacks in Federated Learning

/ May 5, 2026

arXiv:2411.03926v3 Announce Type: replace
Abstract: Federated learning (FL) is widely used in Internet-of-Things (IoT) systems, but its distributed training process also exposes it to backdoor attacks. Existing studies mainly consider single-target or…

cs.CL, cs.CV

Medical thinking with multiple images

/ May 5, 2026

arXiv:2604.16506v2 Announce Type: replace-cross
Abstract: Large language models perform well on many medical QA benchmarks, but real clinical reasoning often requires integrating evidence across multiple images rather than interpreting a single view. …

cs.CV

Registration-Free Learnable Multi-View Capture of Faces in Dense Semantic Correspondence

/ May 5, 2026

arXiv:2605.01450v1 Announce Type: new
Abstract: Recent frameworks like ToFu and TEMPEH provide an automated alternative to classical registration pipelines by predicting 3D meshes in dense semantic correspondence directly from calibrated multi-view im…

cs.AI, cs.RO

VILAS: A VLA-Integrated Low-cost Architecture with Soft Grasping for Robotic Manipulation

/ May 5, 2026

arXiv:2605.02037v1 Announce Type: cross
Abstract: We present VILAS, a fully low-cost, modular robotic manipulation platform designed to support end-to-end vision-language-action (VLA) policy learning and deployment on accessible hardware. The system i…

cs.CV, cs.RO

VoxAfford: Multi-Scale Voxel-Token Fusion for Open-Vocabulary 3D Affordance Detection

/ May 5, 2026

arXiv:2605.01365v1 Announce Type: cross
Abstract: Open-vocabulary 3D affordance detection requires localizing interaction regions on point clouds given novel affordance descriptions. Recent methods extend multimodal large language models (MLLMs) with …