MoE | The Distillery

Low-Bit MoE Quantization for Large Language Models

Comprehensive guide to quantizing large MoE models like DeepSeek-V3/R1, covering techniques for efficient memory usage and inference optimization.