Cell


Overview

The Cell class of MindSpore is the base class for building all networks and the basic unit of a network. To customize a network, inherit the Cell class and override the __init__ and construct methods.

Loss functions, optimizers, and model layers are parts of the network structure, and implementing them requires inheriting the Cell class. You can also customize them based on your requirements.

The following describes the key member functions of the Cell class. The “Building a Network” section will introduce MindSpore's built-in loss functions, optimizers, and model layers implemented based on the Cell class, explain how to use them, and describe how to use the Cell class to build a customized network.

Key Member Functions

construct

The Cell class overrides the __call__ method. When the Cell class instance is called, the construct method is executed. The network structure is defined in the construct method.

In the following example, a simple network is built to implement convolution computation. The operators used in the network are defined in __init__ and applied in the construct method. The network structure of the case is: Conv2D -> BiasAdd.

In the construct method, x is the input data, and output is the result obtained after the network structure computation.

import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Parameter
from mindspore.common.initializer import initializer

class Net(nn.Cell):
    def __init__(self, in_channels=10, out_channels=20, kernel_size=3):
        super(Net, self).__init__()
        self.conv2d = ops.Conv2D(out_channels, kernel_size)
        self.bias_add = ops.BiasAdd()
        self.weight = Parameter(initializer('normal', [out_channels, in_channels, kernel_size, kernel_size]))
        self.bias = Parameter(initializer('zeros', [out_channels]))

    def construct(self, x):
        output = self.conv2d(x, self.weight)
        output = self.bias_add(output, self.bias)
        return output
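
To see construct in action, the following sketch (the input shape [1, 10, 32, 32] is chosen purely for illustration) runs the network. Calling net(x) invokes __call__, which in turn executes construct:

import numpy as np
import mindspore
from mindspore import Tensor

x = Tensor(np.ones([1, 10, 32, 32]), mindspore.float32)
net = Net()
output = net(x)  # __call__ dispatches to construct
print(output.shape)  # (1, 20, 30, 30), since ops.Conv2D defaults to 'valid' padding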

parameters_dict

The parameters_dict method retrieves all parameters in the network structure and returns an OrderedDict whose keys are the parameter names and whose values are the Parameter objects.

The Cell class provides many other methods for returning parameters, such as get_parameters and trainable_params. For details, see the MindSpore API documentation.

A code example is as follows:

net = Net()
result = net.parameters_dict()
print(result.keys())
print(result['weight'])

The following information is displayed:

odict_keys(['weight', 'bias'])
Parameter (name=weight, shape=(20, 10, 3, 3), dtype=Float32, requires_grad=True)

In the example, Net is the network built in the preceding case; the code prints the names of all parameters on the network and the value of the weight parameter.
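
As a quick illustration of the related accessors mentioned above, the following sketch (reusing the same net instance) iterates over all parameters and lists the trainable ones:

for param in net.get_parameters():  # iterator over all parameters in the network
    print(param.name)
print(net.trainable_params())  # list of parameters with requires_grad=True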

cells_and_names

The cells_and_names method returns an iterator over every Cell on the network, including the network itself, yielding a (name, cell) tuple for each.

The following case obtains and prints the name of each Cell. According to the network structure, the network contains a single sub-Cell, an nn.Conv2d layer named conv.

nn.Conv2d is a convolutional layer encapsulated by MindSpore using Cell as the base class. For details, see “Model Layers”.

A code example is as follows:

import mindspore.nn as nn

class Net1(nn.Cell):
    def __init__(self):
        super(Net1, self).__init__()
        self.conv = nn.Conv2d(3, 64, 3, has_bias=False, weight_init='normal')

    def construct(self, x):
        out = self.conv(x)
        return out

net = Net1()
names = []
for name, cell in net.cells_and_names():
    print((name, cell))
    if name:
        names.append(name)
print('-------names-------')
print(names)

The following information is displayed:
('', Net1<
  (conv): Conv2d<input_channels=3, output_channels=64, kernel_size=(3, 3),stride=(1, 1),  pad_mode=same, padding=0, dilation=(1, 1), group=1, has_bias=False,weight_init=normal, bias_init=zeros, format=NCHW>
  >)
('conv', Conv2d<input_channels=3, output_channels=64, kernel_size=(3, 3),stride=(1, 1),  pad_mode=same, padding=0, dilation=(1, 1), group=1, has_bias=False,weight_init=normal, bias_init=zeros, format=NCHW>)
-------names-------
['conv']

set_grad

The set_grad API specifies whether the network requires gradients. If the API is called without an argument, requires_grad defaults to True, and the backward network needed to compute gradients is generated when the forward network is executed.

Take TrainOneStepCell as an example. It performs a single training step on the network, which requires building the backward network, so set_grad is called in its initialization method.

A part of the TrainOneStepCell code is as follows:

class TrainOneStepCell(Cell):
    def __init__(self, network, optimizer, sens=1.0):
        super(TrainOneStepCell, self).__init__(auto_prefix=False)
        self.network = network
        self.network.set_grad()
        ......

If you use encapsulated APIs such as TrainOneStepCell or GradOperation, you do not need to call set_grad yourself; it is already called internally.

If you customize such training APIs, call set_grad inside the API (as TrainOneStepCell does) or call network.set_grad() externally.
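
As a usage sketch (reusing the Net1 network from the cells_and_names section, with illustrative loss and optimizer choices), the following builds a single-step training network; no external set_grad call is needed because TrainOneStepCell makes it internally:

import mindspore.nn as nn

net = Net1()  # forward network from the cells_and_names example
loss_net = nn.WithLossCell(net, nn.MSELoss())  # attach a loss function
optim = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
train_net = nn.TrainOneStepCell(loss_net, optim)
# set_grad() has already been called inside TrainOneStepCell, so each call
# train_net(data, label) runs one forward, backward, and update step.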

set_train

The set_train interface recursively sets the training attribute of the current Cell and all of its sub-Cells. When called without an argument, it sets training to True.

For networks whose structure differs between training and inference, the training attribute distinguishes the two scenarios, and calling set_train when the network runs switches its execution logic accordingly.

For example, part of the code of nn.Dropout is as follows:

class Dropout(Cell):
    def __init__(self, keep_prob=0.5, dtype=mstype.float32):
        """Initialize Dropout."""
        super(Dropout, self).__init__()
        self.keep_prob = keep_prob
        # seed0 and seed1 come from seed-initialization code elided here
        self.dropout = ops.Dropout(keep_prob, seed0, seed1)
        ......

    def construct(self, x):
        if not self.training:
            return x

        if self.keep_prob == 1:
            return x

        out, _ = self.dropout(x)
        return out

In nn.Dropout, two execution branches are selected according to the Cell's training attribute: when training is False, the input is returned directly; when training is True, the Dropout operator is executed. Therefore, when defining a network, set its execution mode according to the training or inference scenario. Take nn.Dropout as an example:

import mindspore.nn as nn
net = nn.Dropout()
# execute training
net.set_train()
# execute inference
net.set_train(False)

to_float

The to_float interface recursively configures a forced cast type on the current Cell and all of its sub-Cells so that the network structure runs with the specified float type. It is typically used in mixed-precision scenarios.

For details of to_float and mixed precision, please refer to Enabling Mixed Precision.
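
As a minimal sketch, the following casts a single convolution layer (chosen only for illustration) so that it computes in float16:

import mindspore
import mindspore.nn as nn

net = nn.Conv2d(3, 64, 3)
net.to_float(mindspore.float16)  # inputs are cast to float16 inside the Cell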

Relationship Between the nn Module and the ops Module

The nn module of MindSpore is a set of model components implemented in Python. It encapsulates low-level APIs and includes various model layers, loss functions, and optimizers.

In addition, nn provides some APIs with the same names as Primitive operators, which further encapsulate those operators and offer friendlier interfaces.

Revisit the case from the construct section above. This case is simplified from the source code of MindSpore's nn.Conv2d, which internally calls ops.Conv2D. The nn.Conv2d convolution API adds input parameter validation and determines whether to use bias; it is a high-level, encapsulated model layer.

import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Parameter
from mindspore.common.initializer import initializer

class Net(nn.Cell):
    def __init__(self, in_channels=10, out_channels=20, kernel_size=3):
        super(Net, self).__init__()
        self.conv2d = ops.Conv2D(out_channels, kernel_size)
        self.bias_add = ops.BiasAdd()
        self.weight = Parameter(initializer('normal', [out_channels, in_channels, kernel_size, kernel_size]))
        self.bias = Parameter(initializer('zeros', [out_channels]))

    def construct(self, x):
        output = self.conv2d(x, self.weight)
        output = self.bias_add(output, self.bias)
        return output
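
For comparison, a minimal sketch of the encapsulated layer follows (shapes chosen for illustration): nn.Conv2d creates the weight and optional bias internally and validates its arguments, so the handwritten Parameter management above is no longer needed:

import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

conv = nn.Conv2d(10, 20, 3, has_bias=True, weight_init='normal')
x = Tensor(np.ones([1, 10, 32, 32]), mindspore.float32)
print(conv(x).shape)  # (1, 20, 32, 32), since nn.Conv2d defaults to pad_mode='same'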