Ensembling Pruned Attention Heads For Uncertainty-Aware Efficient Transformers
arXiv:2510.18358v2 Announce Type: replace
Abstract: Uncertainty quantification (UQ) is essential for deploying deep neural networks in safety-critical settings. Although methods like Deep Ensembles achieve strong UQ performance, their high computation…