Posts
All the articles I've posted.
GPTQ Math Derivation
Posted on: September 9, 2023 at 12:00 AM
This blog post traces the development of GPTQ, starting from its roots in OBD, through OBS, and finally to OBC.
Triton Tutorial #1
Posted on: September 5, 2023 at 12:00 AM
Second post in the Triton tutorial series: fused softmax, plus debugging and benchmarking it.
Triton Tutorial #0
Posted on: September 2, 2023 at 12:00 AM
First post in the Triton tutorial series: an introduction to Triton, installation, and a vector-add example.
Benchmark for LLM Inference
Posted on: August 20, 2023 at 12:00 AM
Introduces some metrics for benchmarking LLM inference.
RoPE and Length Scaling
Posted on: August 10, 2023 at 12:00 AM
Introduces some basic concepts of position encoding, RoPE, and the length extrapolation related to it.