MindSpore Fully Supports DeepSeek V3 Training and Inference
Following MindSpore's earlier support for DeepSeek V3 671B training and inference on Ascend clusters, the MindSpore version of DeepSeek V3 has now also released its fine-tuning capability. With this, MindSpore fully supports the end-to-end DeepSeek V3 workflow, from training and fine-tuning to inference.
Open Source Links
MindSpore open-source community
DeepSeek V3 training and fine-tuning code:
https://gitee.com/mindspore/mindformers/tree/dev/research/deepseek3
Modelers community
DeepSeek V3 inference code:
https://modelers.cn/models/MindSpore-Lab/DeepSeek-V3
The links contain comprehensive step-by-step tutorials, enabling developers to get started effortlessly.
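For a quick sense of what the inference entry point can look like, below is a minimal text-generation sketch using the MindSpore Transformers `pipeline` API. The registered model name `deepseek_v3` and the generation parameters are illustrative assumptions; the linked tutorial describes the actual multi-NPU serving setup that a 671B model requires.

```python
# Minimal text-generation sketch with the MindSpore Transformers pipeline API.
# NOTE: "deepseek_v3" as a registered model name and max_length=128 are
# assumptions for illustration only; follow the linked tutorial for the real
# distributed inference configuration.
from mindformers import pipeline

text_generator = pipeline(task="text_generation", model="deepseek_v3")
outputs = text_generator("Introduce MindSpore in one sentence.", max_length=128)
print(outputs)
```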
DeepSeek V3 MindSpore Fine-Tuning Capability Released
MindSpore Transformers supports full-parameter fine-tuning of DeepSeek V3. By following the steps below, you can quickly launch fine-tuning on a single Atlas 800T A2 (64 GB):
Environment preparation → Dataset preparation → Model weight preparation → Configuration modification → Task execution
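As an illustration of the configuration-modification and task-execution steps above, here is a minimal Python sketch based on the MindSpore Transformers `Trainer` API. The model name, dataset path, and checkpoint path are placeholders; the tutorial linked above uses the repository's YAML configs and distributed launcher, which this sketch only approximates.

```python
# Minimal full-parameter fine-tuning sketch with the MindSpore Transformers
# Trainer API. All paths and the "deepseek_v3" model name are hypothetical
# placeholders; a 671B model additionally requires the multi-node parallel
# setup described in the linked tutorial.
from mindformers import Trainer

trainer = Trainer(
    task="text_generation",             # fine-tune as a text-generation task
    model="deepseek_v3",                # assumed registered model name
    train_dataset="/path/to/dataset",   # preprocessed fine-tuning dataset
)

# Launch full-parameter fine-tuning from the converted pretrained weights.
trainer.finetune(finetune_checkpoint="/path/to/deepseek_v3.ckpt")
```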
If you have any questions or suggestions while using the model, please provide feedback through one of the following communities:
Discuss the use of DeepSeek V3 in the MindSpore open-source community:
https://gitee.com/mindspore/mindformers/issues/IBL0X5?from=project-issue
Discuss the use of DeepSeek V3 in the MindSpore forum of the Ascend community:
https://www.hiascend.com/forum/thread-02112174450796469017-1-1.html