cs.AI

BitCal-TTS: Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models

arXiv:2605.05561v1
Abstract: Post-training quantization makes large reasoning models practical under tight memory and latency budgets, but it can distort the online signals that drive adaptive test-time compute allocation. Under a f…