Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings

arXiv:2604.18603v1 Announce Type: cross

Bidirectional transformers are the foundation of many sequence modeling tasks across natural, biological, and chemical language domains, but they are permutation-invariant without explicit positional embeddings. In contrast, unidirectional attention inherently encodes positional information through its triangular mask, enabling models to operate without positional embeddings altogether. Here, we