LocalLLaMA

Cloudflare open-sources lossless LLM compression tool

Cloudflare released Unweight, a lossless compression system that shrinks LLM weights by 15–22% with no loss of output accuracy. On Meta's Llama-3.1-8B running on Nvidia H100 GPUs, compressing the MLP weights saves roughly 3 GB of VRAM. Cloudflare…
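
As a rough sanity check on those figures, the sketch below works out what share of the model a 3 GB saving represents. It assumes Llama-3.1-8B has about 8.03 billion parameters stored in BF16 (2 bytes each); neither number comes from the announcement, and `saved_fraction` is a hypothetical helper written for this illustration, not part of Unweight.

```python
# Assumptions (not from the announcement): ~8.03e9 parameters, BF16 storage.
PARAMS = 8.03e9        # assumed parameter count for Llama-3.1-8B
BYTES_PER_PARAM = 2    # assumed BF16, 2 bytes per weight
GB = 1e9               # decimal gigabytes


def saved_fraction(saved_gb: float,
                   params: float = PARAMS,
                   bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the saved VRAM as a fraction of the uncompressed weight size."""
    total_gb = params * bytes_per_param / GB
    return saved_gb / total_gb


if __name__ == "__main__":
    # Share of the full model that a 3 GB saving would represent.
    print(f"{saved_fraction(3.0):.1%}")
```

Under these assumptions, the 3 GB saving falls inside the quoted 15–22% range, so the two figures in the summary are at least consistent with each other.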