Provide.ai

Rethinking Network Topologies for Cost-Effective Mixture-of-Experts LLM Serving

/ May 5, 2026

arXiv:2605.00254v1 Announce Type: cross
Abstract: Mixture-of-experts (MoE) architectures have turned LLM serving into a cluster-scale workload in which communication consumes a considerable portion of LLM serving runtime. This has prompted industry to…

cs.CL, cs.CV

Medical thinking with multiple images

/ May 5, 2026

arXiv:2604.16506v2 Announce Type: replace-cross
Abstract: Large language models perform well on many medical QA benchmarks, but real clinical reasoning often requires integrating evidence across multiple images rather than interpreting a single view. …

cs.CV

From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs

/ May 5, 2026

arXiv:2605.02130v1 Announce Type: new
Abstract: Human-level agentic intelligence extends beyond low-level geometric perception, evolving from recognizing where things are to understanding what they are for. While existing benchmarks effectively evalua…

cs.DC, cs.LG

STAR: Decode-Phase Rescheduling for LLM Inference

/ May 5, 2026

arXiv:2510.13668v2 Announce Type: replace-cross
Abstract: Large Language Model (LLM) inference has emerged as a fundamental paradigm; however, variations in output length cause severe workload imbalance in the decode phase, particularly for long-outpu…

cs.DL, cs.LG

ARA: Agentic Reproducibility Assessment For Scalable Support Of Scientific Peer-Review

/ May 5, 2026

arXiv:2605.02651v1 Announce Type: cross
Abstract: Scientific peer review increasingly struggles to assess reproducibility at the scale and complexity of modern research output. Evaluating reproducibility requires reconstructing experimental dependenci…

cs.AI, cs.SE

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

/ May 5, 2026

arXiv:2604.28139v2 Announce Type: replace-cross
Abstract: LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks freeze a curated task set at release time …

cs.LG, cs.SY, eess.SY, quant-ph

From Characterization To Construction: Generative Quantum Circuit Synthesis from Gate Set Tomography Data

/ May 5, 2026

arXiv:2605.01367v1 Announce Type: cross
Abstract: High-fidelity circuit execution on noisy intermediate-scale quantum devices is bottlenecked by compilation pipelines that disregard complex, correlated noise. To address this, this methodology article …

cs.AI, cs.CL, cs.CV

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

/ May 5, 2026

arXiv:2604.28123v2 Announce Type: replace-cross
Abstract: The standard post-training recipe for large multimodal models (LMMs) applies supervised fine-tuning (SFT) on curated demonstrations followed by reinforcement learning with verifiable rewards (R…

cs.AI, cs.LG

Recurrent Deep Reinforcement Learning for Chemotherapy Control under Partial Observability

/ May 5, 2026

arXiv:2605.02552v1 Announce Type: cross
Abstract: Chemotherapy dose optimization can be formulated as a dynamic treatment regime, requiring sequential decisions under uncertainty that must balance tumor suppression against toxicity. However, most rein…

cs.CV

SpecEdit: Training-Free Acceleration for Diffusion based Image Editing via Semantic Locking

/ May 5, 2026

arXiv:2605.02152v1 Announce Type: new
Abstract: Diffusion-based image editing offers strong semantic controllability, but remains computationally expensive due to iterative high-resolution denoising over all spatial tokens. Dynamic-resolution sampling…