The NVIDIA Kimi-K2.6-NVFP4 model is the quantized version of Moonshot AI's Kimi-K2.6 model, an auto-regressive language model that uses an optimized transformer architecture. For more information, please check here. The NVIDIA Kimi-K2.6-NVFP4 model is quantized with Model Optimizer.
This model is ready for commercial/non-commercial use.
The accuracy benchmark results are presented in the table below:
| Precision | GPQA Diamond | SciCode | τ²-Bench Telecom | MMMU Pro | AA-LCR | IFBench |
|---|---|---|---|---|---|---|
| Baseline (INT4) | 90.9 | 52.6 | 98.2 | 75.6 | 71.0 | 73.9 |
| NVFP4 | 90.4 | 54.4 | 98.0 | 76.5 | 71.8 | 73.9 |
Baseline: Kimi-K2.6 in its native INT4 format. Both rows were benchmarked with temperature=1.0, top_p=0.95, and a maximum token count of 128000.
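To make the comparison easier to read, here is a minimal sketch that computes the per-benchmark difference between the NVFP4 row and the INT4 baseline row. The scores are copied from the table; the names `score_deltas`, `INT4`, and `NVFP4` are made up for this illustration and are not part of any NVIDIA or Moonshot AI tooling.

```python
# Scores copied from the table above, in column order.
BENCHMARKS = ["GPQA Diamond", "SciCode", "τ²-Bench Telecom",
              "MMMU Pro", "AA-LCR", "IFBench"]
INT4 = [90.9, 52.6, 98.2, 75.6, 71.0, 73.9]   # baseline row
NVFP4 = [90.4, 54.4, 98.0, 76.5, 71.8, 73.9]  # quantized row


def score_deltas(baseline, quantized):
    """Return quantized minus baseline per benchmark, rounded to one decimal."""
    return [round(q - b, 1) for b, q in zip(baseline, quantized)]


deltas = score_deltas(INT4, NVFP4)
mean_delta = round(sum(deltas) / len(deltas), 2)

# Positive values mean the NVFP4 row scored higher than the INT4 baseline.
for name, delta in zip(BENCHMARKS, deltas):
    print(f"{name:<18} {delta:+.1f}")
print(f"{'Mean':<18} {mean_delta:+.2f}")
```

On these numbers, NVFP4 is within about half a point of the baseline on GPQA Diamond and τ²-Bench Telecom, ties on IFBench, and is slightly ahead on SciCode, MMMU Pro, and AA-LCR. The post does not report run-to-run variance, so differences this small may not be meaningful.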