LLM Quantization Review

This blog post provides an overview of the fundamental concepts of quantization, as well as a review of mainstream quantization methods in the context of LLMs.

October 2, 2023 · 11 min · Sherlock

GPTQ Code Implementation

This blog post delves into the code implementation of the GPTQ quantization process, using the Llama model as a case study.

September 18, 2023 · 14 min · Sherlock

GPTQ Math Derivation

This blog post traces the development of GPTQ from its roots in OBD (Optimal Brain Damage), through OBS (Optimal Brain Surgeon), to OBC (Optimal Brain Compression).

September 9, 2023 · 7 min · Sherlock