Google targets AI inference bottlenecks with TurboQuant
Google says its new TurboQuant method could make AI models run more efficiently by compressing the key-value (KV) cache used in LLM inference and by speeding up vector search.
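The article does not describe how TurboQuant itself works, but the general idea of compressing a KV cache is to store activations at lower numeric precision. As a purely illustrative sketch (not Google's method; all names and the round-to-nearest int8 approach are assumptions), per-row 8-bit quantization of a key/value tensor looks like this:

```python
import numpy as np

def quantize_int8(x):
    # Per-row scale: map the max magnitude in each row to the int8 range [-127, 127].
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero on all-zero rows
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float tensor.
    return q.astype(np.float32) * scale

# Toy "KV cache" slice: a batch of head vectors.
kv = np.random.randn(4, 64).astype(np.float32)
q, s = quantize_int8(kv)
recon = dequantize(q, s)
# int8 storage is 4x smaller than float32; per-element error is bounded by half a scale step.
err = np.abs(kv - recon).max()
```

Real schemes (TurboQuant included, per the article's framing) aim to push such compression further while keeping inference quality intact.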
In tests on Gemma and Mistral models, the …