# MindSpore Golden Stick 1.2.0 Release Notes

## Major Features and Improvements

- The Post-Training Quantization algorithm `PTQ` supports the MoE structure with the `SmoothQuant-A8W8` quantization algorithm and the `GPTQ-A16W4` low-bit quantization algorithm. Both have been adapted for the DeepSeekV3/R1 model.
- Added OutlierSuppression-Lite (OSL), an outlier suppression technique. OSL extends SmoothQuant by tuning the migration strength alpha through hyperparameter optimization at the granularity of individual weight matrices, giving more fine-grained, network-adaptive calibration (see the sketch after the table below). In the function call scenario of the DeepSeek V3-0324 network, OSL preserves higher accuracy and achieves BFCL scores on par with the official FP8 baseline.
- [Demo] The Post-Training Quantization algorithm `PTQ` supports the `A8W4` quantization algorithm, adapted for the DeepSeekV3/R1 model.
- Added loading and evaluation of the `wikitext`, `boolq`, `ceval`, and `gsm8k` datasets.
- Accuracy of DeepSeekR1:

| method | ceval | gsm8k |
| --- | --- | --- |
| BF16 | 89.67 | 91.74 |
| SmoothQuant-A8W8 | 89.45 | 92.42 |
| OSL-A8W8 | 89.9 | 91.81 |
| GPTQ-A16W4 | 89.52 | 91.12 |
| A8W4 | 89.0 | 91.51 |
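
The OSL item above tunes SmoothQuant's migration strength alpha per weight matrix. As a rough illustration only, the NumPy sketch below searches alpha for a single linear layer by minimizing the A8W8 fake-quantized matmul error on a small calibration batch. It assumes the standard SmoothQuant scaling s_j = max|X_j|^alpha / max|W_j|^(1-alpha); the function names (`quantize_sym`, `smooth_scales`, `search_alpha`) are hypothetical and are not part of the `mindspore_gs` API.

```python
# Illustrative sketch of OSL-style per-matrix alpha search; not the mindspore_gs implementation.
import numpy as np


def quantize_sym(t, bits=8, axis=None):
    """Symmetric fake quantization of `t` to `bits` bits, per tensor or along `axis`."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(t), axis=axis, keepdims=axis is not None) / qmax
    scale = np.maximum(scale, 1e-8)
    return np.clip(np.round(t / scale), -qmax - 1, qmax) * scale


def smooth_scales(x, w, alpha):
    """SmoothQuant migration scales per input channel: s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)."""
    act_max = np.maximum(np.abs(x).max(axis=0), 1e-8)   # [in_channels]
    wgt_max = np.maximum(np.abs(w).max(axis=1), 1e-8)   # [in_channels]; w is [in, out]
    return act_max ** alpha / wgt_max ** (1.0 - alpha)


def search_alpha(x, w, candidates=np.linspace(0.3, 0.9, 13)):
    """OSL-style per-matrix search: pick the alpha with the smallest A8W8 matmul error."""
    ref = x @ w
    best_alpha, best_err = None, np.inf
    for alpha in candidates:
        s = smooth_scales(x, w, alpha)
        xq = quantize_sym(x / s, bits=8)                    # per-tensor activation quantization
        wq = quantize_sym(w * s[:, None], bits=8, axis=0)   # per-output-channel weight quantization
        err = float(np.mean((xq @ wq - ref) ** 2))
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Calibration batch with a few outlier activation channels, as commonly seen in LLM inference.
    x = rng.standard_normal((32, 128)) * np.exp(rng.standard_normal(128))
    w = rng.standard_normal((128, 64)) * 0.05
    alpha, err = search_alpha(x, w)
    print(f"selected alpha = {alpha:.2f}, calibration MSE = {err:.6f}")
```
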
## Contributors

Thanks goes to these wonderful people:
tongl, zhuxiaochen, guoguopot, huangzhuo, ccsszz, yyyyrf, hangangqiang, HeadSnake
Contributions of any kind are welcome!