CalexNet: Soft Cascade-Aligned Training and Calibration for Lightweight Early-Exit Branches

arXiv:2509.08318v2

Abstract: Early-exit cascades over a frozen convolutional backbone enable adaptive inference but suffer from three sources of train-inference mismatch: branches train on samples they will never see at inference, their per-class precision thresholds are calibrated on the wrong distribution, and the standard cross-entropy target on backbone argmax labels discards the backbone's uncertainty signal. We close all three gaps with CalexNet (Cascade-Aligned Early eXits), a training-recipe-only modification: branches train under continuously weighted importance sampling that matches the cascade-survivor distribution; per-class precision thresholds are calibrated on the actual cascade-survivor subset of the validation set; and the classification head is trained against the backbone's full softmax via a temperature-scaled KL objective. Combined with an augmented prototype-pooling branch head, CalexNet is evaluated on ResNet18 and ResNet50 backbones across CIFAR-100 (the 20-superclass coarse labels, the harder primary setting) and CINIC-10 (10-class, the easier cross-validation counterpart). On the accuracy-FLOPs Pareto frontier, CalexNet matches or exceeds three published baselines (PTEEnet, ZTW, BoostNet) and a within-paper "no-alignment, no-KD" reference. The largest gains appear in the practically relevant 30-70% FLOPs-reduction regime and are stable across n=3 training seeds. CalexNet requires no inference-time architectural change and drops into any frozen-backbone early-exit cascade.
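As a rough illustration of the third ingredient, the temperature-scaled KL objective that trains the branch head against the backbone's full softmax, here is a minimal PyTorch-style sketch. The function name, the default temperature, and the tau-squared gradient rescaling follow the standard knowledge-distillation convention and are assumptions, not details taken from the paper.

```python
import torch.nn.functional as F

def kd_kl_loss(branch_logits, backbone_logits, tau=2.0):
    """Temperature-scaled KL between the frozen backbone's softmax
    (teacher) and the branch head's softmax (student).

    tau=2.0 is an illustrative default; the paper's value may differ.
    """
    teacher = F.softmax(backbone_logits / tau, dim=-1)
    student_log = F.log_softmax(branch_logits / tau, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as
    # target; the tau**2 factor restores the usual gradient scale in
    # the standard distillation convention.
    return F.kl_div(student_log, teacher, reduction="batchmean") * tau**2
```

In a cascade-aligned setup, this loss would be weighted per sample by the (continuously weighted) probability that the sample survives to the branch in question, so that the branch optimizes on the distribution it actually sees at inference.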

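The survivor-calibrated thresholding can be sketched in the same spirit. Assuming a per-class precision target of the usual kind, the illustrative NumPy routine below picks, for each class, the lowest confidence threshold at which the branch's predictions on the cascade-survivor subset of the validation set still meet the target precision; all names and the 0.95 default are hypothetical, not taken from the paper.

```python
import numpy as np

def calibrate_class_thresholds(probs, labels, survivor_mask,
                               target_precision=0.95):
    """Per-class confidence thresholds calibrated on cascade survivors.

    probs:         (N, C) branch softmax outputs on the validation set
    labels:        (N,) ground-truth labels
    survivor_mask: (N,) bool, True for samples earlier exits pass on
    """
    probs, labels = probs[survivor_mask], labels[survivor_mask]
    preds, conf = probs.argmax(axis=1), probs.max(axis=1)
    n_classes = probs.shape[1]
    thresholds = np.ones(n_classes)  # default 1.0: never exit on this class
    for c in range(n_classes):
        sel = preds == c
        if not sel.any():
            continue
        # Rank this class's predictions by confidence, descending.
        order = np.argsort(-conf[sel])
        correct = (labels[sel] == c)[order]
        # Precision of the top-k most confident predictions, for every k.
        precision = np.cumsum(correct) / np.arange(1, correct.size + 1)
        ok = np.where(precision >= target_precision)[0]
        if ok.size:
            # Lowest confidence that still meets the precision target.
            thresholds[c] = conf[sel][order][ok[-1]]
    return thresholds
```

Calibrating on the survivor subset rather than the full validation set is the point of the second gap the abstract describes: the thresholds are tuned on the same conditional distribution the branch will face once earlier exits have already removed the easy samples.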