Network Models


Q: When MindSpore is used for model training, there are four input parameters for CTCLoss: inputs, labels_indices, labels_values, and sequence_length. How do I use CTCLoss for model training?

A: The dataset received by the model.train API can return multiple pieces of data per step, for example (data1, data2, data3, ...). It can therefore carry the inputs, labels_indices, labels_values, and sequence_length information together. You only need to define a dataset in the corresponding format and pass it to model.train. For details, see the Data Processing API.
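
For example, the four inputs can be provided as four dataset columns. The following is a minimal sketch; the generator, shapes, and batch layout are illustrative assumptions, not requirements of the API:

import numpy as np
from mindspore import dataset as ds

def ctc_data_generator(num):
    # Illustrative shapes: (max_time, batch_size, num_classes) logits,
    # sparse label indices and values, and per-sample sequence lengths.
    for _ in range(num):
        inputs = np.random.randn(50, 2, 28).astype(np.float32)
        labels_indices = np.array([[0, 0], [1, 0]]).astype(np.int64)
        labels_values = np.array([3, 7]).astype(np.int32)
        sequence_length = np.array([50, 50]).astype(np.int32)
        yield inputs, labels_indices, labels_values, sequence_length

# The column order matches the argument order expected by CTCLoss.
ds_train = ds.GeneratorDataset(
    list(ctc_data_generator(16)),
    column_names=["inputs", "labels_indices", "labels_values", "sequence_length"])
# model.train then feeds all four columns to the network and loss at each step.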


Q: How do I load the PyTorch weight to MindSpore during model transfer?

A: First, read the PyTorch PTH file. Take ResNet-18 as an example: the MindSpore network structure is the same as the PyTorch one, so after conversion the parameters can be loaded directly into the network. Only BN and Conv2D layers are involved during loading; if the parameter names of other layers differ between MindSpore and PyTorch, rename them to match.
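
The following is a minimal sketch, assuming torch is available to read the PTH file. The BN renames shown (weight/bias to gamma/beta, running statistics to moving statistics) are the common ones; the exact matching rule depends on your network definition:

import torch
from mindspore import Parameter, Tensor, load_param_into_net

def load_pth_into_ms_net(pth_file, net):
    pt_params = torch.load(pth_file, map_location='cpu')
    ms_params = {}
    for name, value in pt_params.items():
        # Common BatchNorm renames between PyTorch and MindSpore.
        name = name.replace('running_mean', 'moving_mean')
        name = name.replace('running_var', 'moving_variance')
        # BN weight/bias are called gamma/beta in MindSpore; adjust this
        # matching rule to the layer names used by your own network.
        if 'bn' in name:
            name = name.replace('.weight', '.gamma').replace('.bias', '.beta')
        ms_params[name] = Parameter(Tensor(value.numpy()), name=name)
    load_param_into_net(net, ms_params)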


Q: After a model is trained, how do I save the model output in text or npy format?

A: The network output is a Tensor. Use the asnumpy() method to convert it to a NumPy array and then save the data. For details, see the following:

import numpy as np

out = net(x)
np.save("output.npy", out.asnumpy())

Q: Must data be converted into MindRecords when MindSpore is used for segmentation training?

A: build_seg_data.py is the script that generates MindRecord files from the dataset; you can use it directly or adapt it to your own dataset. Alternatively, you can use GeneratorDataset if you want to read the dataset yourself.

GeneratorDataset example

GeneratorDataset API description


Q: Can MindSpore read a TensorFlow checkpoint?

A: No. The checkpoint format of MindSpore is different from that of TensorFlow. Although both use Protocol Buffers, their message definitions are different. Currently, MindSpore cannot read TensorFlow or PyTorch checkpoints directly.


Q: How do I perform training without processing data in MindRecord format?

A: You can use the customized data loading method GeneratorDataset. For details, see the GeneratorDataset API description.


Q: What framework models and formats can be directly read by MindSpore? Can a PTH model trained in PyTorch be loaded into the MindSpore framework for use?

A: MindSpore uses Protocol Buffers (protobuf) to store training parameters and cannot directly read model files from other frameworks. A model file stores parameters and their values. You can use the APIs of another framework to read its parameters, obtain the key-value pairs, and load them into MindSpore. If you want to reuse a checkpoint trained by another framework, read the parameters and then call the save_checkpoint API of MindSpore to save them as a .ckpt file that MindSpore can read.
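
For example, the following minimal sketch assumes the parameters have already been read from the source framework into a dict mapping MindSpore-style names to NumPy arrays:

from mindspore import Tensor, save_checkpoint

def convert_to_ms_checkpoint(param_dict, ckpt_path):
    # save_checkpoint accepts a list of {"name": ..., "data": Tensor} dicts.
    save_list = [{"name": name, "data": Tensor(value)}
                 for name, value in param_dict.items()]
    save_checkpoint(save_list, ckpt_path)

# convert_to_ms_checkpoint(params_read_elsewhere, "converted.ckpt")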


Q: How do I use models trained by MindSpore on Ascend 310? Can they be converted to models used by HiLens Kit?

A: Yes. HiLens Kit uses Ascend 310 as its inference core, so the two questions are essentially the same. Ascend 310 requires a dedicated OM model. Use MindSpore to export an ONNX or AIR model and then convert it into an OM model supported by Ascend 310. For details, see Multi-platform Inference.
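
A minimal export sketch, assuming an already trained image-classification network with a 1x3x224x224 input; the network name, checkpoint file, and input shape are placeholders:

import numpy as np
from mindspore import Tensor, export, load_checkpoint, load_param_into_net

net = ResNet()  # placeholder: your trained network definition
load_param_into_net(net, load_checkpoint("resnet.ckpt"))
input_data = Tensor(np.ones([1, 3, 224, 224]).astype(np.float32))
export(net, input_data, file_name="resnet", file_format="AIR")
# The resulting AIR file can then be converted to an OM model with the Ascend ATC tool.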


Q: How do I modify parameters (such as the dropout value) on MindSpore?

A: When building the network, add a dropout layer and set the dropout value when constructing it. During verification, call network.set_train(False) before execution to disable dropout. During training, call network.set_train(True) to enable it.
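
A minimal sketch, assuming the nn.Dropout API in which the dropout value is set through keep_prob at construction time:

import mindspore.nn as nn

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Dense(16, 10)
        self.dropout = nn.Dropout(keep_prob=0.8)  # modify the dropout value here

    def construct(self, x):
        x = self.fc(x)
        x = self.dropout(x)  # active only in training mode
        return x

net = Net()
net.set_train(True)   # training: dropout enabled
net.set_train(False)  # verification: dropout disabled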


Q: Where can I view the sample code or tutorial of MindSpore training and inference?

A: Please see the training and inference tutorials on the MindSpore official website.


Q: What types of models are currently supported by MindSpore for training?

A: MindSpore has basic support for common training scenarios. Please refer to the Release Notes for detailed information.


Q: What are the available recommendation or text generation networks or models provided by MindSpore?

A: Currently, recommendation models such as Wide & Deep, DeepFM, and NCF are under development. In the natural language processing (NLP) field, Bert_NEZHA is available and models such as MASS are under development. You can rebuild the network into a text generation network based on the scenario requirements. Please stay tuned for updates on the MindSpore Model Zoo.


Q: How simple can the MindSpore model training code be?

A: Apart from the network definition, MindSpore provides the Model API. In most scenarios, model training can be completed with only a few lines of code.
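
For example, after the network is defined, a typical training job takes only a few lines. This is a sketch; net and ds_train are assumed to be defined elsewhere:

import mindspore.nn as nn
from mindspore import Model

net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)
model = Model(net, net_loss, net_opt, metrics={"accuracy"})
model.train(1, ds_train)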


Q: How do I use MindSpore to fit functions such as \(f(x)=a \times \sin(x)+b\)?

A: The following is based on the official MindSpore linear fitting case.

# The function to fit is f(x) = 2*sin(x) + 3.
import numpy as np
from mindspore import dataset as ds
from mindspore.common.initializer import Normal
from mindspore import nn, Model, context
from mindspore.train.callback import LossMonitor

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

def get_data(num, w=2.0, b=3.0):
    # f(x)=w * sin(x) + b
    # f(x)=2 * sin(x) +3
    for i in range(num):
        x = np.random.uniform(-np.pi, np.pi)
        noise = np.random.normal(0, 1)
        y = w * np.sin(x) + b + noise
        yield np.array([np.sin(x)]).astype(np.float32), np.array([y]).astype(np.float32)

def create_dataset(num_data, batch_size=16, repeat_size=1):
    input_data = ds.GeneratorDataset(list(get_data(num_data)), column_names=['data','label'])
    input_data = input_data.batch(batch_size)
    input_data = input_data.repeat(repeat_size)
    return input_data

class LinearNet(nn.Cell):
    def __init__(self):
        super(LinearNet, self).__init__()
        self.fc = nn.Dense(1, 1, Normal(0.02), Normal(0.02))

    def construct(self, x):
        x = self.fc(x)
        return x

if __name__ == "__main__":

    num_data = 1600
    batch_size = 16
    repeat_size = 1
    lr = 0.005
    momentum = 0.9

    net = LinearNet()
    net_loss = nn.loss.MSELoss()
    opt = nn.Momentum(net.trainable_params(), lr, momentum)
    model = Model(net, net_loss, opt)

    ds_train = create_dataset(num_data, batch_size=batch_size, repeat_size=repeat_size)
    model.train(1, ds_train, callbacks=LossMonitor(), dataset_sink_mode=False)

    print(net.trainable_params()[0], "\n%s" % net.trainable_params()[1])

Q: How do I use MindSpore to fit quadratic functions such as \(f(x)=ax^2+bx+c\)?

A: The following code is referenced from the official MindSpore tutorial code.

Modify the following items to fit \(f(x) = ax^2 + bx + c\):

  1. Dataset generation.

  2. Network fitting.

  3. Optimizer.

The modifications are explained in detail below:

# The selected optimizer does not support CPU, so the GPU platform is used for training. You need to install the GPU version of MindSpore.
context.set_context(mode=context.GRAPH_MODE, device_target="GPU")

# Assume that the function to be fitted is f(x)=2x^2+3x+4. Modify the data generation function as follows:
def get_data(num, a=2.0, b=3.0, c=4.0):
    for i in range(num):
        x = np.random.uniform(-10.0, 10.0)
        noise = np.random.normal(0, 1)
        # For details about how to generate the value of y, see the to-be-fitted objective function ax^2+bx+c.
        y = x * x * a + x * b + c + noise
        # When fitting a*x^2 + b*x + c, a and b are weight parameters and c is the bias.
        # The training data for the two weights is x^2 and x, respectively, so the
        # dataset generation is changed as follows:
        yield np.array([x*x, x]).astype(np.float32), np.array([y]).astype(np.float32)

def create_dataset(num_data, batch_size=16, repeat_size=1):
    input_data = ds.GeneratorDataset(list(get_data(num_data)), column_names=['data','label'])
    input_data = input_data.batch(batch_size)
    input_data = input_data.repeat(repeat_size)
    return input_data

class LinearNet(nn.Cell):
    def __init__(self):
        super(LinearNet, self).__init__()
        # The fully connected layer now takes two input features, so the input size
        # is changed to 2. The first Normal(0.02) randomly initializes the weights of
        # the two inputs, and the second Normal randomly initializes the bias.
        self.fc = nn.Dense(2, 1, Normal(0.02), Normal(0.02))

    def construct(self, x):
        x = self.fc(x)
        return x

if __name__ == "__main__":
    num_data = 1600
    batch_size = 16
    repeat_size = 1
    lr = 0.005
    momentum = 0.9

    net = LinearNet()
    net_loss = nn.loss.MSELoss()
    # The RMSProp optimizer works better for quadratic function fitting; it currently supports the Ascend and GPU platforms.
    opt = nn.RMSProp(net.trainable_params(), learning_rate=0.1)
    model = Model(net, net_loss, opt)

    ds_train = create_dataset(num_data, batch_size=batch_size, repeat_size=repeat_size)
    model.train(1, ds_train, callbacks=LossMonitor(), dataset_sink_mode=False)

    print(net.trainable_params()[0], "\n%s" % net.trainable_params()[1])