cs.AI, cs.MA

QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs

arXiv:2605.03884v1 Announce Type: new
Abstract: Multi-agent LLM systems on edge devices need to hand off latent context efficiently, but the practical options today are either an expensive re-prefill or a full-precision KV-cache transfer. We study QKVShare, a framework…
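The abstract is truncated, so QKVShare's actual quantization scheme is not specified here. As a minimal sketch of the general idea it names (handing off a KV cache in low precision rather than re-prefilling or shipping fp32 tensors), the following uses symmetric per-channel int8 quantization with fp16 scales; the function names, tensor shapes, and quantization choices are all illustrative assumptions, not the paper's method.

```python
import numpy as np

def quantize_kv(kv: np.ndarray):
    # Symmetric int8 quantization along the last axis: scale = max|x| / 127.
    # (Illustrative choice; the paper's actual scheme may differ.)
    scale = np.abs(kv).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.round(kv / scale).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale.astype(np.float32)

# Toy KV cache: (layers, heads, seq_len, head_dim), fp32.
rng = np.random.default_rng(0)
kv = rng.standard_normal((2, 4, 16, 64)).astype(np.float32)

q, scale = quantize_kv(kv)
kv_hat = dequantize_kv(q, scale)

full_bytes = kv.nbytes                   # fp32 handoff payload
handoff_bytes = q.nbytes + scale.nbytes  # int8 values + fp16 scales
max_err = float(np.abs(kv - kv_hat).max())
print(full_bytes, handoff_bytes, max_err)
```

The receiving agent would dequantize on arrival and resume decoding, trading a small reconstruction error for roughly a 4x smaller transfer than full-precision fp32.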