FlashAttention-4 gives the NVIDIA Blackwell platform its most optimized attention kernel yet
On March 5, 2026, the much-anticipated paper for FlashAttention-4 (FA4) was published. The code had been released on GitHub months earlier, early benchmarks had circulated, and preliminary results were presented at Hot Chips in August 2025. Now we have the …