
Feature-Related FAQ


Q: What is the difference between the names MindSpore Transformers and MindFormers?

A: Both refer to the same suite. MindSpore Transformers is the suite's official name; MindFormers is its abbreviated name, serving as both the repository name and the designation used within the code.


Q: What is the difference between the MindSpore Transformers and MindSpore NLP suites?

A: MindSpore Transformers is MindSpore's large-model suite, designed primarily for training and inference of large language models (LLMs) and multimodal models (MMs) in large-scale scenarios. MindSpore NLP is MindSpore's domain-specific suite, designed primarily for training small- and medium-sized models in the natural language processing (NLP) domain. The two differ in positioning; choose whichever fits your requirements.


Q: The WikiText dataset download link is not available.

A: The official download link is currently unavailable. Please follow the community Issue #IBV35D for updates.


Q: How Do I Generate a Model Sharding Strategy File?

A: The model sharding strategy file records how model weights are sharded in distributed scenarios and is typically needed when slicing weights offline. Set only_save_strategy: True in the task's yaml file, then start the distributed task as usual; the distributed strategy files are generated in the output/strategy/ directory. For details, refer to the Tutorial on Slicing and Merging Distributed Weights.
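For reference, a minimal sketch of the workflow (the launch script, config path, and worker count below are illustrative and may differ in your environment):

    # 1. In the task's yaml file, enable strategy-only saving:
    #      only_save_strategy: True
    # 2. Start the distributed task as usual, e.g. a single-node 8-card run:
    bash scripts/msrun_launcher.sh "run_mindformer.py --config configs/llama2/pretrain_llama2_7b.yaml --run_mode train" 8
    # 3. The strategy files are written to:
    ls output/strategy/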


Q: What Should I Do When socket.gaierror: [Errno -2] Name or service not known or socket.gaierror: [Errno -3] Temporary failure in name resolution Is Reported During ranktable File Generation?

A: Starting from MindSpore Transformers r1.2.0, cluster startup is unified through msrun, and the ranktable startup method has been deprecated; launch distributed tasks with msrun instead of generating a ranktable file.
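For reference, a minimal single-node sketch of an msrun launch (the entry script, config path, and port are illustrative):

    msrun --worker_num=8 --local_worker_num=8 --master_port=8118 \
          --log_dir=msrun_log --join=True \
          run_mindformer.py --config configs/llama2/pretrain_llama2_7b.yaml --run_mode train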


Q: When installing MindSpore Transformers from source code, the download speed of dependency packages is slow. How can this be resolved?

A: The build.sh script uses the Tsinghua mirror to download the Python packages required by MindSpore Transformers. To change the mirror source, modify the download command in build.sh, pip install mindformers*whl -i https://pypi.tuna.tsinghua.edu.cn/simple, replacing the URL after -i with the address of your preferred mirror.
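For example, switching to the Huawei Cloud mirror (any reachable PyPI mirror works the same way) would change the line to:

    pip install mindformers*whl -i https://repo.huaweicloud.com/repository/pypi/simple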

