UAT-LITE: Inference-Time Uncertainty-Aware Attention for Pretrained Transformers

arXiv:2602.02952v2. Neural NLP models are often miscalibrated and overconfident, assigning high confidence to incorrect predictions and failing to express uncertainty during internal evidence aggregation. This undermines selective prediction and high-stakes deployment. Post-hoc calibration methods adjust output probabilities but leave internal computation unchanged, while ensemble and Bayesian approaches improve uncertainty at substantial