LLM | The Distillery

SmoothQuant and AWQ

This blog post compares SmoothQuant and AWQ, covering their differences and their code implementations.

This blog post delves into the code implementation of the GPTQ quantization process, using the Llama model as a case study.

This blog post traces the development of GPTQ, starting from its roots in OBD, through OBS, and finally to OBC.

Introduces some metrics for LLM inference benchmarking.

Introduces some basic concepts of position encoding, RoPE, and the length extrapolation related to it.

A survey-style article summarizing details from code LLM papers, covering everything from data collection to training.

Introduces two very good papers on SIFT data that I read recently, WizardLM (WizardCoder) and Orca, along with some of my own thoughts on the problem.

An introduction to continued pre-training.