AI RESEARCH
Architecture Determines Observability in Transformers
arXiv cs.LG
arXiv:2604.24801v1 Announce Type: new Autoregressive transformers make confident errors, but activation monitoring can catch them only if the model preserves an internal signal that output confidence does not expose. This preservation is determined by architecture and