Technical Blog
Writing FlashAttention in Triton (Part 2): From the Algorithm to a Real Kernel, and Fusing RoPE
Jun 15, 2026
Writing FlashAttention in Triton (Part 1): The Memory Wall and the Online Softmax Trick
May 26, 2026
Transformers (Decoder-Only) (Part 2)
Jan 26, 2026
Algorithms (Deep Learning Ops)
Sep 27, 2025
Generative Adversarial Networks (GANs)
Sep 27, 2025
Transformers (Decoder-Only) (Part 1)
Jun 18, 2025
GPU Programming
Apr 2, 2025
Understanding Metal and MSL
Jan 30, 2024
Deep Learning
Dec 25, 2023