AI RESEARCH
Ensembling Pruned Attention Heads For Uncertainty-Aware Efficient Transformers
arXiv cs.LG
arXiv:2510.18358v2 Announce Type: replace
Uncertainty quantification (UQ) is essential for deploying deep neural networks in safety-critical settings. Although methods like Deep Ensembles achieve strong UQ performance, their high computational and memory costs hinder scalability to large models. We