Thomas Witt - Provide.ai

XFP: Quality-Targeted Adaptive Codebook Quantization with Sparse Outlier Separation for LLM Inference

Thomas Witt / May 15, 2026

arXiv:2605.14844v1 Announce Type: cross
Abstract: We introduce XFP, a dynamic weight quantizer for LLM inference that inverts the conventional workflow: the operator specifies reconstruction quality floors on per-channel cosine similarity (one strict …
