TritonSigmoid: A fast, padding-aware sigmoid attention kernel for GPUs [R]
We are open-sourcing TritonSigmoid, a fast, padding-aware sigmoid attention kernel for GPUs. We built it for single-cell foundation models, where every cell is represented as a sequence of genes. A single gene can be regulated by multiple transcript…
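For readers new to the idea, here is a minimal pure-Python sketch of what "padding-aware sigmoid attention" computes for a single head. It is a slow reference for intuition only, not the TritonSigmoid kernel or its API. The function names and the `key_is_real` mask argument are hypothetical, and the 1/sqrt(d) scaling is an assumption carried over from standard attention.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_attention(q, k, v, key_is_real):
    """Reference sigmoid attention for one head (illustrative only).

    q: list of query vectors, each of length d
    k: list of key vectors, each of length d
    v: list of value vectors
    key_is_real: list of bools, False marks a padding position
    """
    scale = 1.0 / math.sqrt(len(q[0]))  # assumed scaling, as in softmax attention
    out = []
    for qi in q:
        acc = [0.0] * len(v[0])
        for kj, vj, real in zip(k, v, key_is_real):
            if not real:
                continue  # a padded key contributes exactly zero
            score = sum(a * b for a, b in zip(qi, kj)) * scale
            w = sigmoid(score)  # element-wise weight, no normalization across keys
            for d, x in enumerate(vj):
                acc[d] += w * x
        out.append(acc)
    return out


# Usage: the padded second key has no effect on the result.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [5.0, 5.0]]
v = [[2.0, 4.0], [100.0, 100.0]]
padded = sigmoid_attention(q, k, v, [True, False])
unpadded = sigmoid_attention(q, k[:1], v[:1], [True])
```

Unlike softmax, each key's weight is computed independently of the others, so skipping padded keys changes nothing for the real ones. The weights also need not sum to one, which means several keys can receive high weight at the same time.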