
Average Attention Transformers and Arithmetic Circuits

arXiv CS.AI

arXiv:2605.04683v1 Announce Type: cross

We analyse the computational power of transformer encoders as sequence-to-sequence functions on vectors. We show that average hard attention can be used to simulate arithmetic circuits when the circuits are given as input to the encoder. The circuit families that can be simulated in this way have constant depth and use unbounded addition, binary multiplication, and sign gates. The transformers we use have arithmetic circuits in place of feed-forward networks.
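To make the terms in the abstract concrete, here is a minimal Python sketch of average hard attention as it is commonly defined: each query attends only to the positions with the maximal score and takes the uniform average of their value vectors. It also includes toy versions of the three gate types the abstract names. All function names are hypothetical, the sign convention (-1, 0, 1) is an assumption, and nothing here is taken from the paper's actual construction.

```python
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product, used as the attention score."""
    return sum(a * b for a, b in zip(u, v))


def average_hard_attention(
    queries: Sequence[Vector],
    keys: Sequence[Vector],
    values: Sequence[Vector],
) -> List[Vector]:
    """For each query, average the values at all positions whose score is maximal.

    Ties are detected with exact float equality, which is fine for this toy
    example but would need a tolerance in numerical code.
    """
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) for k in keys]
        best = max(scores)
        winners = [i for i, s in enumerate(scores) if s == best]
        outputs.append(
            [sum(values[i][d] for i in winners) / len(winners) for d in range(dim)]
        )
    return outputs


# Toy gate types matching the abstract's list (illustrative only).
def add_gate(inputs: Sequence[float]) -> float:
    """Addition over any number of inputs."""
    return sum(inputs)


def mul_gate(a: float, b: float) -> float:
    """Multiplication of exactly two inputs."""
    return a * b


def sign_gate(x: float) -> float:
    """Assumed convention: -1 for negative, 0 for zero, 1 for positive."""
    return float((x > 0) - (x < 0))


if __name__ == "__main__":
    # Positions 0 and 1 tie for the top score, so their values are averaged.
    out = average_hard_attention(
        queries=[[1.0]],
        keys=[[1.0], [1.0], [0.0]],
        values=[[2.0], [4.0], [100.0]],
    )
    print(out)  # [[3.0]]
```

In this example the average over the two tied positions is 3.0, and multiplying it by the number of tied positions recovers their sum, 6.0. This is only an arithmetic observation about averaging, offered as intuition for how an averaging mechanism relates to addition; the abstract does not say how the simulation is actually carried out.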