RoPE and Length Scaling

An introduction to the basic concepts of position encoding, RoPE, and the related problem of length extrapolation.

August 10, 2023 · 8 min · Sherlock

CodeLLM Training Recipe

A survey-style article summarizing details from CodeLLM papers, covering the pipeline from data collection to training.

July 26, 2023 · 8 min · Sherlock

How to Do Continued Pre-training

A brief introduction to continued pre-training.

July 4, 2023 · 2 min · Sherlock