# MindSpore Transformers Contribution Guidelines

## Contributing Code to MindSpore Transformers

### Code Style Requirements

Please follow this style guide when reviewing, maintaining, and developing MindSpore Transformers.

#### Coding Guide

The MindSpore Transformers community uses the Python PEP 8 coding style. It is recommended to install the following plugins in your IDE to check the code format: Lizard, ShellCheck, and PyLint.

#### Unit Testing Guide
The MindSpore Transformers community uses the Python unit testing framework pytest. Test case names and docstrings need to reflect the design intent of the test case.
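As a generic illustration of a name and docstring that state the design intent (a minimal sketch not tied to any MindSpore Transformers module; the `Feature`/`Description`/`Expectation` docstring layout follows the full example later in this document):

```python
import pytest


class TestTensorShape:
    @pytest.mark.level1
    def test_reshape_keeps_element_count(self):
        """
        Feature: array reshape
        Description: Reshaping a 2x3 array to 3x2 must preserve the element count
        Expectation: success
        """
        import numpy as np

        arr = np.arange(6).reshape(2, 3)
        assert arr.reshape(3, 2).size == arr.size
```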
#### Refactoring Guide

We encourage developers to refactor our code to eliminate code smells. All code must conform to the coding style and testing guidelines, and refactored code is no exception. The Lizard threshold for uncommented lines of code (nloc) is 100, and the cyclomatic complexity (ccn) threshold is 20. When a Lizard warning is received, the code must be refactored before it can be merged.
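As a rough illustration of what the nloc metric counts (lines that are neither blank nor comment-only; Lizard's real metric is more sophisticated, so treat this as a sketch only):

```python
# Rough illustration of "uncommented lines of code" (nloc): count lines
# that are neither blank nor comment-only. This is a simplified sketch,
# not Lizard's actual implementation.
def rough_nloc(source: str) -> int:
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


snippet = """
# a comment
x = 1

y = x + 1
"""
print(rough_nloc(snippet))  # 2
```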
#### Documentation Guide

We use MarkdownLint to check the Markdown document format. The following rules are modified from the default configuration:
- MD007 (unordered list indentation): the parameter `indent` is set to 4, indicating that all contents of an unordered list need to be indented by 4 spaces.
- MD009 (trailing spaces): the parameter `br_spaces` is set to 2, indicating that a line may end with either 0 or 2 spaces.
- MD029 (ordered list item prefix): the parameter `style` is set to `ordered`, indicating ascending sequence numbers.
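For reference, these three overrides could be expressed in a `.markdownlint.json` configuration file (a hypothetical sketch; the key names follow markdownlint's documented parameters):

```json
{
  "MD007": { "indent": 4 },
  "MD009": { "br_spaces": 2 },
  "MD029": { "style": "ordered" }
}
```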
### Fork-Pull Development Model Guide
#### Fork the MindSpore Transformers code repository

Before submitting code to the MindSpore Transformers project, make sure that you have forked this project into your own repository. The MindSpore Transformers repository and your fork may evolve in parallel, so take care to keep them consistent.
#### Clone the remote code repository

If you want to download the code to your local computer, use git:

```shell
# Clone the repository on AtomGit
git clone https://atomgit.com/(insert_your_forked_repo)/mindformers.git
```
#### Develop the code locally

`master` is the development branch. Please pull the latest code from the `master` branch for development, and target the `master` branch when you submit your Pull Request.

```shell
git checkout -b {new branch name} origin/master
```
#### Submit a PR to the MindSpore Transformers code repository

In the last step, you need to open a compare request between your new branch and the MindSpore Transformers `master` branch. After the pull request is created, Jenkins CI is automatically triggered for build testing. Your PR should be merged into the upstream `master` branch as soon as possible to minimize the risk of merge conflicts.

```shell
# Add all changes to the staging area
git add .
# Check the update status
git status
# Commit your changes; add a commit title with the -m option
git commit -m "The title of your commit"
# Add a detailed description of the commit, add a signature with the -s option,
# and amend the most recent commit with the --amend option
git commit -s --amend
# Push changes to the new branch in your remote repository
git push origin {New branch name}
```
## Documentation and Code Format

If you wish to merge custom models into the MindSpore Transformers code repository, there are a few things to keep in mind:

- The file format and location should follow the conventions below.
- Register the new model in the code to adapt it for use with the high-level interfaces.
### File Format and Location

The model code files are placed uniformly in the `research/{model_name}` folder in the following format:

```text
research/{model_name}
├── {model_name}
|    ├── {pretrain/finetune/predict}_{model_name}_{n}b.yaml
|    ├── convert_weight.py         # Torch weights to MindSpore weights script (required for migrated models)
|    ├── convert_reversed.py       # MindSpore weights to Torch weights script (required for migrated models)
|    ├── run_{model_name}.py       # Running the code file
|    ├── {model_name}.py           # Model class code file
|    └── {model_name}_tokenizer.py # Tokenizer code file
```

Model documentation is placed in the same `research/{model_name}` folder.
## Requirements for Submitting a PR

### Only One Commit

For multi-commit PRs, use interactive rebase to squash multiple commits into one. For example, run:

```shell
git rebase -i HEAD~3
```

You will see:

```text
pick 1234567 Add new function A
pick 89abcdef Fixed bugs in A
pick 01234567 Some optimizations to A
```

Change `pick` to `squash` (the commands can be abbreviated, e.g. `s` for squash, `p` for pick, `f` for fixup) on the commits you want to merge into the first one:

```text
pick 1234567 Add new function A
squash 89abcdef Fixed bugs in A
squash 01234567 Some optimizations to A
```
### PR Descriptions

Please use the following Markdown template:

```markdown
### Related Issue

### Reason (purpose, problem solved, etc.)

### Description (what was done, what was changed)

### Checklist

#### Was a design review or root cause analysis of the problem completed (Y/N)

#### Were the UT/ST of the functional modules completed, executed, and passed, with results attached (Y/N)

#### Does the change modify public components or external interfaces; if so, give the scope of modification and an impact assessment (Y/N)

#### Does the change affect documentation; if so, was the documentation updated in sync (Y/N)
```
### Access Control Requirements

- Submitting a PR requires signing a CLA.
- Submitting a PR requires passing the CI check. If the gate fails, fix the code and then manually restart the check by commenting `/retest` under the PR.
## Test Case Contribution

### Organization Structure

#### Directory Structure

```text
tests/
├── st/                            # System tests: verify end-to-end workflows of multi-component collaboration
│   ├── test_auto_register/        # Test automatic registration of custom models/operators
│   ├── test_ckpt_health_monitor/  # Test integrity checks of model checkpoints
│   ├── test_docs/                 # Test runnability of code examples in documentation
│   ├── test_grace_exit_save_ckpt/ # Test checkpoint saving during training interruption
│   ├── test_infer/                # Test single-card/multi-card/offline inference workflows
│   ├── test_model/                # Test consistency of model execution across multiple devices
│   ├── test_multi_cards_cases/    # Test multi-card distributed training/inference
│   ├── test_optim/                # Test optimizers/learning rates/mixed-precision training
│   ├── test_resume/               # Test training resumption from breakpoints
│   ├── test_safetensors/          # Test loading/saving of Safetensors checkpoints
├── utils/                         # Test utility library: data generation, device detection, etc.
├── conftest.py                    # pytest global configuration: environment checks, initialization
```
#### Basic Specifications

Test case marking rules:

- NPU test cases: `@pytest.mark.platform_arm_ascend910b_training`
- CPU test cases: `@pytest.mark.platform_x86_cpu`
- Single-card test cases: `@pytest.mark.env_onecard`
- Multi-card test cases (8 cards by default): `@pytest.mark.env_single`

Test case development specifications:

- Test cases generate cache files in the directory of the test file.
- Add execution-rule marks above methods (including class methods) for all test cases, not above classes.
- Test files start with `test_`, classes start with `Test`, and methods start with `test`.

Test case level specifications:

- Level 0: combined-interface test cases (only parallel interfaces are classified into this level)
- Level 1: full-network function test cases, single-card test cases of parallel computing interfaces, atomic interface test cases
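Putting the marking and naming rules together, a compliant skeleton might look like the following (generic placeholder logic, not an actual MindSpore Transformers test):

```python
import pytest


# The file would be named test_demo_rules.py: the file starts with "test_",
# the class with "Test", and the method with "test".
class TestMarkPlacement:
    # Marks go on the method, not on the class.
    @pytest.mark.level1
    @pytest.mark.platform_x86_cpu
    @pytest.mark.env_onecard
    def test_sum_is_stable(self):
        """
        Feature: demo
        Description: Summing the same list twice gives the same result
        Expectation: success
        """
        data = [1, 2, 3]
        assert sum(data) == sum(data)
```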
### Execution Examples

#### Install Dependencies

```shell
pip3 install -r requirements.txt
```

#### Execute a Single Test File

```shell
pytest tests/st/test_demo.py
```

#### Execute with Mark Filtering

```shell
# Filter test cases with the specified marks using the -m parameter
# Execute all NPU single-card test cases
pytest test_demo.py -v -m "platform_arm_ascend910b_training and env_onecard"
```

#### Execute a Single Test Method

```shell
# Execute the Ascend single-card training test case
pytest test_demo.py::TestMyModelTrainPredict::test_train_ascend_single_card -v
```
### Test Case Example

The following is a complete implementation of `tests/st/test_demo.py` that complies with the specifications, covering the core scenarios of CPU single-card training, Ascend single-card training, and Ascend multi-card inference:
```python
import pytest
import numpy as np
import mindspore as ms
from mindspore import nn
from mindspore.dataset import GeneratorDataset
from mindformers import Trainer, TrainingArguments
from mindformers.core.optim import AdamW
from mindformers.tools.logger import logger
from mindformers.models.llama import LlamaForCausalLM, LlamaConfig
from mindformers.trainer.optimizer_grouped_parameters import (
    get_optimizer_grouped_parameters,
)


# Naming convention: test files start with test_, classes start with Test,
# format: Test + model name + core function
class TestSimpleCPUModel(nn.Cell):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Dense(16, 8)
        self.relu = nn.ReLU()
        self.fc2 = nn.Dense(8, 2)

    def construct(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x


class TestMyModelTrainPredict:
    @classmethod
    def setup_class(cls):
        """Class-level initialization: initialize model / training configuration"""
        ms.set_device("Ascend")
        cls.num_layers = 2
        cls.seq_length = 2
        cls.vocab_size = 32000
        cls.step_num = 1
        cls.model_config = LlamaConfig(
            num_layers=cls.num_layers,
            seq_length=cls.seq_length,
            use_flash_attention=True,
        )
        cls.train_args = TrainingArguments(
            batch_size=1, num_train_epochs=1, sink_mode=False, loss_scale_value=1024
        )

    def gen_dummy_data(self):
        """Generate a dummy dataset for testing"""
        size = (
            self.step_num * self.train_args.batch_size,
            self.model_config.seq_length + 1,
        )
        input_ids = np.random.randint(low=0, high=self.vocab_size, size=size).astype(
            np.int32
        )
        for input_id in input_ids:
            yield input_id

    @pytest.mark.level1
    @pytest.mark.platform_x86_cpu
    @pytest.mark.env_onecard
    def test_train_x86_cpu_single_card(self):
        """
        Feature: mindformers model train
        Description: Test X86 architecture CPU single-card model training and inference
        Expectation: success
        """
        def gen_data():
            for _ in range(5):
                data = np.random.rand(16).astype(np.float32)
                label = np.array(0, dtype=np.int32)
                yield data, label

        dataset = GeneratorDataset(gen_data, column_names=["data", "label"])
        dataset = dataset.batch(batch_size=2)
        net = TestSimpleCPUModel()
        loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
        optim = nn.Adam(net.trainable_params(), learning_rate=0.001)
        model = ms.Model(net, loss_fn=loss, optimizer=optim)
        model.train(epoch=1, train_dataset=dataset, dataset_sink_mode=False)
        test_input = ms.Tensor(np.random.rand(16).astype(np.float32))
        output = net(test_input)
        assert net is not None
        assert output is not None
        logger.info("X86 CPU single-card training test passed!")

    @pytest.mark.level0
    @pytest.mark.platform_arm_ascend910b_training
    @pytest.mark.env_onecard
    def test_train_ascend_single_card(self):
        """
        Feature: mindformers model train
        Description: Test Atlas 800T A2 single-card Llama model training
        Expectation: success
        """
        dataset = GeneratorDataset(self.gen_dummy_data, column_names=["input_ids"])
        dataset = dataset.batch(batch_size=self.train_args.batch_size)
        model = LlamaForCausalLM(self.model_config)
        model.construct = ms.jit(jit_level="O1")(model.construct)
        group_params = get_optimizer_grouped_parameters(model=model)
        optimizer = AdamW(params=group_params)
        trainer = Trainer(
            task="text_generation",
            model=model,
            args=self.train_args,
            train_dataset=dataset,
            optimizers=optimizer,
        )
        trainer.config.callbacks = trainer.config.callbacks[:1]
        train_result = trainer.train()
        assert model is not None, "Model initialization failed after training"
        assert train_result is not None, "Training returned no result"
        logger.info("Ascend single-card training test passed!")

    @pytest.mark.level1
    @pytest.mark.platform_arm_ascend910b_training
    @pytest.mark.env_single
    def test_predict_ascend_multi_card(self):
        """
        Feature: mindformers model predict
        Description: Test Atlas 800T A2 multi-card Llama model inference
        Expectation: success
        """
        model = LlamaForCausalLM(self.model_config)
        output = model.generate([1], max_length=5, do_sample=False)
        assert output is not None, "Inference output is empty"
        logger.info("Ascend multi-card inference test passed!")


if __name__ == "__main__":
    # Local debug execution: run NPU single-card test cases by default
    pytest.main(["-v", __file__, "-m", "platform_arm_ascend910b_training and env_onecard"])
```