🧶 Hi, I'm Sherlock!
Greetings, readers, welcome to my blog!
I am Sherlock, an engineer specializing in LLMs, computer vision, and deep learning. Recently, I have been digging into various aspects of LLMs, including pre-training, SFT, RLHF, and the related infrastructure.
Featured
LLM Quantization Review
Posted on: October 2, 2023 at 12:00 AM
This blog post provides an overview of the fundamental concepts of quantization, as well as a review of mainstream quantization methods in the context of LLMs.
Recent Posts
MXFP4 and NVFP4
Posted on: August 28, 2025 at 12:00 AM
An introduction to the differences between MXFP4 and NVFP4.
W4A8KV4 Quantization Summary and Best Practices
Posted on: August 30, 2024 at 12:00 AM
Comprehensive summary of W4A8KV4 quantization techniques, covering KV4 and W4A8 optimization methods with practical recommendations.
Low-Bit MoE Quantization for Large Language Models
Posted on: July 25, 2024 at 12:00 AM
Comprehensive guide to quantizing large MoE models like DeepSeek-V3/R1, covering techniques for efficient memory usage and inference optimization.
Speculative Sampling for Faster LLM Inference
Posted on: June 20, 2024 at 12:00 AM
Deep dive into speculative sampling techniques for accelerating LLM inference through draft model prediction and rejection sampling.