TritonSigmoid: A fast, padding-aware sigmoid attention kernel for GPUs [R]

We are open-sourcing TritonSigmoid, a fast, padding-aware sigmoid attention kernel for GPUs. We built it for single-cell foundation models, where each cell is represented as a sequence of genes (tokens). A single gene can be regulated by multiple transcription factors at once. Softmax normalizes attention weights to sum to 1, so tokens compete for attention; sigmoid scores each token independently, so the model can attend strongly to many genes simultaneously. Because cells express anywhere from 200 to 16,000+ genes, the kernel handles variable-length padding natively, so you're not wasting compute on empty positions.
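
To make the softmax vs. sigmoid difference concrete, here is a minimal, dependency-free Python sketch for a single query over a padded sequence. It is only an illustration of the idea, not the TritonSigmoid kernel or its API: the function names (`softmax_weights`, `sigmoid_weights`) and the `valid_len` argument are hypothetical, and masking padding by assigning it zero weight is an assumption about how padding might be treated, not a description of what the kernel does internally.

```python
import math

def softmax_weights(scores, valid_len):
    """Softmax over the first valid_len scores; padded positions get weight 0."""
    valid = scores[:valid_len]
    m = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in valid]
    total = sum(exps)
    return [e / total for e in exps] + [0.0] * (len(scores) - valid_len)

def sigmoid_weights(scores, valid_len):
    """Element-wise sigmoid over the first valid_len scores; padding gets weight 0."""
    return [
        1.0 / (1.0 + math.exp(-s)) if i < valid_len else 0.0
        for i, s in enumerate(scores)
    ]

# One query scored against 5 key positions; the last 2 are padding.
# The three real tokens are all equally (and highly) relevant.
scores = [4.0, 4.0, 4.0, 0.0, 0.0]
valid_len = 3

sm = softmax_weights(scores, valid_len)
sg = sigmoid_weights(scores, valid_len)

print("softmax:", [round(w, 3) for w in sm])
print("sigmoid:", [round(w, 3) for w in sg])
```

With softmax, the three equally relevant tokens have to split a total weight of 1, so each gets about a third. With sigmoid, each of them gets a weight close to 1 on its own, and the padded positions contribute nothing in either case.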