LayerNorm Mathematical Derivation and Implementation
Comprehensive mathematical derivation of LayerNorm forward and backward passes, including PyTorch implementation details.
Third blog post in the Triton tutorial series: GEMM and autotuning.
Second blog post in the Triton tutorial series: fused softmax, with debugging and benchmarking.
First blog post in the Triton tutorial series: an introduction to Triton, installation, and a vector-add example.
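An introduction to continued pre-training.
A record of tracking down a mismatch between TensorRT FP16 and PyTorch inference results.
Notes on setting up and using VSCode, with some examples of debugging in deep learning.
Notes on developing a userOp in oneflow and some of the problems encountered along the way.
An explanation of how AutoDiff works, from both the mathematical and the implementation perspective, with a simple code implementation.
The differences and connections between L2 regularization and weight decay.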