Provide.ai - Page 468

LiteCache: A Query Similarity-Driven, GPU-Centric KVCache Subsystem for Efficient LLM Inference

March 30, 2026

arXiv:2511.14510v2 Announce Type: replace
Abstract: During LLM inference, KVCache memory usage grows linearly with sequence length and batch size and often exceeds GPU capacity. Recent proposals offload KV states to host memory and reduce transfers us…

cs.LG

Extending Puzzle for Mixture-of-Experts Reasoning Models with Application to GPT-OSS Acceleration

March 30, 2026

arXiv:2602.11937v2 Announce Type: replace
Abstract: Reasoning-focused LLMs improve answer quality by generating longer reasoning traces, but the additional tokens dramatically increase serving cost, motivating inference optimization. We extend and app…

cs.CR, cs.LG

A Channel-Triggered Backdoor Attack on Wireless Semantic Image Reconstruction

March 30, 2026

arXiv:2503.23866v3 Announce Type: replace-cross
Abstract: This paper investigates backdoor attacks in image-oriented semantic communications. The threat of backdoor attacks on symbol reconstruction in semantic communication (SemCom) systems has receiv…

cs.IR, cs.LG

Defending Against Knowledge Poisoning Attacks During Retrieval-Augmented Generation

March 30, 2026

arXiv:2508.02835v2 Announce Type: replace
Abstract: Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to boost the capabilities of large language models (LLMs) by incorporating external, up-to-date knowledge sources. However, thi…

cs.LG

Large Language Models Can Perform Automatic Modulation Classification via Discretized Self-supervised Candidate Retrieval

March 30, 2026

arXiv:2510.00316v2 Announce Type: replace
Abstract: Identifying wireless modulation schemes is essential for cognitive radio, but standard supervised models often degrade under distribution shift, and training domain-specific wireless foundation model…

cs.LG

Can AI Scientist Agents Learn from Lab-in-the-Loop Feedback? Evidence from Iterative Perturbation Discovery

March 30, 2026

arXiv:2603.26177v1 Announce Type: new
Abstract: Recent work has questioned whether large language models (LLMs) can perform genuine in-context learning (ICL) for scientific experimental design, with prior studies suggesting that LLM-based agents exhib…

cs.LG

ReBaPL: Repulsive Bayesian Prompt Learning

March 30, 2026

arXiv:2511.17339v2 Announce Type: replace
Abstract: Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt learning methods are prone to overfitting and ca…

cs.LG, cs.PF, math.OC, math.PR

Optimization Trade-offs in Asynchronous Federated Learning: A Stochastic Networks Approach

March 30, 2026

arXiv:2603.26231v1 Announce Type: new
Abstract: Synchronous federated learning scales poorly due to the straggler effect. Asynchronous algorithms increase the update throughput by processing updates upon arrival, but they introduce two fundamental cha…

cs.LG

Knowledge Distillation for Efficient Transformer-Based Reinforcement Learning in Hardware-Constrained Energy Management Systems

March 30, 2026

arXiv:2603.26249v1 Announce Type: new
Abstract: Transformer-based reinforcement learning has emerged as a strong candidate for sequential control in residential energy management. In particular, the Decision Transformer can learn effective battery dis…

cs.LG

Improving Risk Stratification in Hypertrophic Cardiomyopathy: A Novel Score Combining Echocardiography, Clinical, and Medication Data

March 30, 2026

arXiv:2603.26254v1 Announce Type: new
Abstract: Hypertrophic cardiomyopathy (HCM) requires accurate risk stratification to inform decisions regarding ICD therapy and follow-up management. Current established models, such as the European Society of Car…