LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

/ April 17, 2026

arXiv:2604.15149v1 Announce Type: new
Abstract: As Reinforcement Learning with Verifiable Rewards (RLVR) has become the dominant paradigm for scaling reasoning capabilities in LLMs, a new failure mode emerges: LLMs gaming verifiers. We study this phen…

cs.AI, cs.LG, cs.SD, eess.AS, eess.SP

Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening

/ April 17, 2026

arXiv:2509.02571v2 Announce Type: replace-cross
Abstract: This paper investigates continuous representations of steering vectors over frequency and microphone/source positions for augmented listening (e.g., spatial filtering and binaural rendering), e…

cs.LG

TOPCELL: Topology Optimization of Standard Cell via LLMs

/ April 17, 2026

arXiv:2604.14237v1 Announce Type: new
Abstract: Transistor topology optimization is a critical step in standard cell design, directly dictating diffusion sharing efficiency and downstream routability. However, identifying optimal topologies remains a …

cs.CV, cs.LG

Beyond Independent Frames: Latent Attention Masked Autoencoders for Multi-View Echocardiography

/ April 17, 2026

arXiv:2604.15096v1 Announce Type: cross
Abstract: Echocardiography is a widely used modality for cardiac assessment due to its non-invasive and cost-effective nature, but the sparse and heterogeneous spatiotemporal views of the heart pose distinct cha…

cs.CL, cs.LG

AdaSplash-2: Faster Differentiable Sparse Attention

/ April 17, 2026

arXiv:2604.15180v1 Announce Type: new
Abstract: Sparse attention has been proposed as a way to alleviate the quadratic cost of transformers, a central bottleneck in long-context training. A promising line of work is $\alpha$-entmax attention, a differ…

cs.LG

RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning

/ April 17, 2026

arXiv:2604.15201v1 Announce Type: new
Abstract: As reinforcement learning (RL) deployments expand into safety-critical domains, existing evaluation methods fail to systematically identify hazards arising from the black-box nature of neural network ena…

cs.AI

AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot

/ April 17, 2026

arXiv:2604.13940v1 Announce Type: new
Abstract: Scientific peer review faces mounting strain as submission volumes surge, making it increasingly difficult to sustain review quality, consistency, and timeliness. Recent advances in AI have led the commu…

cs.AI, cs.LG

Awakening Dormant Experts: Counterfactual Routing to Mitigate MoE Hallucinations

/ April 17, 2026

arXiv:2604.14246v1 Announce Type: new
Abstract: Sparse Mixture-of-Experts (MoE) models have achieved remarkable scalability, yet they remain vulnerable to hallucinations, particularly when processing long-tail knowledge. We identify that this fragilit…

cs.LG, eess.SP

Survey of Deep Learning and Physics-Based Approaches in Computational Wave Imaging

/ April 17, 2026

arXiv:2410.08329v3 Announce Type: replace
Abstract: Computational wave imaging (CWI) extracts hidden structure and physical properties of a volume of material by analyzing wave signals that traverse that volume. Applications include seismic exploratio…

cs.LG

Calibrate-Then-Delegate: Safety Monitoring with Risk and Budget Guarantees via Model Cascades

/ April 17, 2026

arXiv:2604.14251v1 Announce Type: new
Abstract: Monitoring LLM safety at scale requires balancing cost and accuracy: a cheap latent-space probe can screen every input, but hard cases should be escalated to a more expensive expert. Existing cascades de…