cs.AI, cs.AR, cs.LG

AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization

arXiv:2604.18137v1 Announce Type: cross
Abstract: Processing-in-Memory (PIM) architectures offer a promising solution to the memory bottlenecks of data-intensive machine learning, yet existing designs often overlook the growing challenge of activation memory footprint…