Triton Tutorial - Series 2

Build a high-performance matrix multiplication kernel in Triton, optimized step by step until it rivals cuBLAS.

September 20, 2023 · 9 min · Sherlock

Triton Tutorial - Series 1

Learn to write a fused softmax kernel in Triton, along with techniques for debugging and benchmarking it.

September 5, 2023 · 7 min · Sherlock

Triton Tutorial - Series 0

An introduction to the Triton programming language, with an installation guide and a vector-addition example to get you started with GPU kernel development.

September 2, 2023 · 5 min · Sherlock