High-Rate Quantized Matrix Multiplication II
arXiv:2605.13768v1 Announce Type: cross
Abstract: This is the second part of the work investigating quantized matrix multiplication (MatMul). In Part I, we considered the case of calibration-free quantization, whereas here we discuss the setting where …