Provide.ai

Comparison of Modern Multilingual Text Embedding Techniques for Hate Speech Detection Task

/ April 17, 2026

arXiv:2604.14907v1 Announce Type: cross
Abstract: Online hate speech and abusive language pose a growing challenge for content moderation, especially in multilingual settings and for low-resource languages such as Lithuanian. This paper investigates t…

cs.AI

Towards Scalable Lightweight GUI Agents via Multi-role Orchestration

/ April 17, 2026

arXiv:2604.13488v1 Announce Type: new
Abstract: Autonomous Graphical User Interface (GUI) agents powered by Multimodal Large Language Models (MLLMs) enable digital automation on end-user devices. While scaling both parameters and data has yielded subs…

cs.AI, cs.LG

LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

/ April 17, 2026

arXiv:2604.15149v1 Announce Type: new
Abstract: As Reinforcement Learning with Verifiable Rewards (RLVR) has become the dominant paradigm for scaling reasoning capabilities in LLMs, a new failure mode emerges: LLMs gaming verifiers. We study this phen…

cs.CV, cs.LG

Beyond Independent Frames: Latent Attention Masked Autoencoders for Multi-View Echocardiography

/ April 17, 2026

arXiv:2604.15096v1 Announce Type: cross
Abstract: Echocardiography is a widely used modality for cardiac assessment due to its non-invasive and cost-effective nature, but the sparse and heterogeneous spatiotemporal views of the heart pose distinct cha…

cs.CL, cs.LG

AdaSplash-2: Faster Differentiable Sparse Attention

/ April 17, 2026

arXiv:2604.15180v1 Announce Type: new
Abstract: Sparse attention has been proposed as a way to alleviate the quadratic cost of transformers, a central bottleneck in long-context training. A promising line of work is $\alpha$-entmax attention, a differ…

cs.LG

RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning

/ April 17, 2026

arXiv:2604.15201v1 Announce Type: new
Abstract: As reinforcement learning (RL) deployments expand into safety-critical domains, existing evaluation methods fail to systematically identify hazards arising from the black-box nature of neural network ena…

cs.AI

GeoAgentBench: A Dynamic Execution Benchmark for Tool-Augmented Agents in Spatial Analysis

/ April 17, 2026

arXiv:2604.13888v1 Announce Type: new
Abstract: The integration of Large Language Models (LLMs) into Geographic Information Systems (GIS) marks a paradigm shift toward autonomous spatial analysis. However, evaluating these LLM-based agents remains cha…

cs.AI, cs.CR

SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment

/ April 17, 2026

arXiv:2604.13630v1 Announce Type: cross
Abstract: The performance of large language model (LLM) agents depends critically on the execution harness, the system layer that orchestrates tool use, context management, and state persistence. Yet this same a…

cs.AI

AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot

/ April 17, 2026

arXiv:2604.13940v1 Announce Type: new
Abstract: Scientific peer review faces mounting strain as submission volumes surge, making it increasingly difficult to sustain review quality, consistency, and timeliness. Recent advances in AI have led the commu…

cs.LG

Calibrate-Then-Delegate: Safety Monitoring with Risk and Budget Guarantees via Model Cascades

/ April 17, 2026

arXiv:2604.14251v1 Announce Type: new
Abstract: Monitoring LLM safety at scale requires balancing cost and accuracy: a cheap latent-space probe can screen every input, but hard cases should be escalated to a more expensive expert. Existing cascades de…