SmoothQuant and AWQ

This blog post compares SmoothQuant and AWQ, covering their differences and their code implementations.

October 8, 2023 · 12 min · Sherlock

GPTQ Code Implementation

This blog post delves into the code implementation of the GPTQ quantization process, using the Llama model as a case study.

September 18, 2023 · 14 min · Sherlock

GPTQ Math Derivation

This blog post traces the development of GPTQ, starting from its roots in OBD, through OBS, and finally to OBC.

September 9, 2023 · 7 min · Sherlock

Benchmark for LLM Inference

An introduction to some metrics for benchmarking LLM inference.

August 20, 2023 · 5 min · Sherlock

RoPE and Length Scaling

An introduction to the basic concepts of position encoding, RoPE, and the related topic of length extrapolation.

August 10, 2023 · 8 min · Sherlock

CodeLLM Training Recipe

A survey-style article summarizing details from codeLLM-related papers, from data collection through training.

July 26, 2023 · 8 min · Sherlock

Some Thoughts on WizardLM(Coder) and Orca

An introduction to two excellent papers I read recently on SIFT data, WizardLM (WizardCoder) and Orca, along with some of my own thoughts on the topic.

July 22, 2023 · 6 min · Sherlock

How to Do Continued Pre-training

An introduction to continued pre-training.

July 4, 2023 · 2 min · Sherlock