MindSpore Fully Supporting DeepSeek V3 Training and Inference

Following MindSpore's support for DeepSeek V3 671B training and inference on Ascend clusters, the MindSpore version of DeepSeek V3 now also offers fine-tuning capabilities. With this release, MindSpore fully supports the end-to-end DeepSeek V3 workflow, from training and fine-tuning through inference.

Open Source Links

MindSpore open-source community

DeepSeek V3 training and fine-tuning code:

https://gitee.com/mindspore/mindformers/tree/dev/research/deepseek3

Modelers community

DeepSeek V3 inference code:

https://modelers.cn/models/MindSpore-Lab/DeepSeek-V3

Both links include comprehensive step-by-step tutorials, so developers can get started quickly.

DeepSeek V3 MindSpore Fine-Tuning Capability Released

MindSpore Transformers supports full-parameter fine-tuning of DeepSeek V3. By following the steps below, you can quickly launch the fine-tuning process on a single Atlas 800T A2 (64 GB):

1. Environment preparation
2. Dataset preparation
3. Model weight preparation
4. Configuration modification
5. Task execution
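The configuration-modification step typically means editing the YAML file provided with the model before launching the task. The fragment below is only an illustrative sketch: the key names (`load_checkpoint`, `train_dataset`, `parallel_config`, and so on) follow common MindSpore Transformers conventions and the paths are placeholders, so check them against the tutorial in the repository linked above before use.

```yaml
# Illustrative fine-tuning config fragment for MindSpore Transformers.
# Key names follow common MindSpore Transformers conventions; verify them
# against the DeepSeek V3 tutorial before use. Paths are placeholders.
load_checkpoint: "/path/to/deepseek_v3_weights"    # converted model weights
train_dataset: &train_dataset
  data_loader:
    type: MindDataset
    dataset_dir: "/path/to/finetune_dataset"       # prepared fine-tuning data
  shuffle: True
parallel_config:
  data_parallel: 1
  model_parallel: 8        # tensor parallelism across NPUs
  pipeline_stage: 1
```

After adjusting the weight and dataset paths and the parallel settings for your cluster size, the fine-tuning task can be launched following the task-execution instructions in the tutorial.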

If you have any questions or suggestions while using the model, please provide feedback through one of the following communities:

Discuss the use of DeepSeek V3 in the MindSpore open-source community:

https://gitee.com/mindspore/mindformers/issues/IBL0X5?from=project-issue

Discuss the use of DeepSeek V3 in the MindSpore forum of the Ascend community:

https://www.hiascend.com/forum/thread-02112174450796469017-1-1.html