mindspore.Parameter

class mindspore.Parameter(default_input, name=None, requires_grad=True, layerwise_parallel=False, parallel_optimizer=True)[source]

Parameter is a Tensor subclass. When a Parameter is assigned as a Cell attribute, it is automatically added to the list of that Cell's parameters and will appear, for example, in the cell.get_parameters() iterator.

Note

  • In the “semi_auto_parallel” and “auto_parallel” parallel modes, if a Parameter is initialized with a Tensor, the type of the Parameter will be Tensor. The Tensor saves the shape and type info of a tensor with no memory usage. The shape can be changed while compiling for auto-parallel. Calling init_data returns a Tensor Parameter with initialized data.

  • If an operator in the network requires some of its inputs to be Parameters, those Parameters are not allowed to be cast.

  • Give each Parameter a unique name to facilitate subsequent operations and updates. If two or more Parameter objects in a network share the same name, you will be prompted to set a unique name when defining them.

Parameters
  • default_input (Union[Tensor, int, float, numpy.ndarray, list]) – Data used to initialize the parameter.

  • name (str) –

    Name of the parameter. Default: None.

    1) If the parameter is not given a name, the default name is its variable name. For example, the name of param_a below is name_a, and the name of param_b is the variable name param_b.

    self.param_a = Parameter(Tensor([1], ms.float32), name="name_a")
    self.param_b = Parameter(Tensor([2], ms.float32))
    

    2) If a parameter in a list or tuple is not given a name, it is assigned a unique name. For example, the names of the parameters below are Parameter$1 and Parameter$2.

    self.param_list = [Parameter(Tensor([3], ms.float32)),
                       Parameter(Tensor([4], ms.float32))]
    

    3) If a parameter is given a name and another parameter already uses the same name, an exception is thrown. For example, the code below throws “its name ‘name_a’ already exists.”

    self.param_a = Parameter(Tensor([1], ms.float32), name="name_a")
    self.param_tuple = (Parameter(Tensor([5], ms.float32), name="name_a"),
                        Parameter(Tensor([6], ms.float32)))
    

    4) If a parameter appears multiple times in a list or tuple, its name is checked only once. For example, the code below does not throw an exception.

    self.param_a = Parameter(Tensor([1], ms.float32), name="name_a")
    self.param_tuple = (self.param_a, self.param_a)
    

  • requires_grad (bool) – True if the parameter requires gradient. Default: True.

  • layerwise_parallel (bool) – When layerwise_parallel is True in data/hybrid parallel mode, broadcast and gradient communication are not applied to the parameter. Default: False.

  • parallel_optimizer (bool) – Used to filter the weight shard operation in semi auto or auto parallel mode. It takes effect only when the parallel optimizer is enabled in mindspore.context.set_auto_parallel_context(). Default: True.
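
The four naming rules for name can be summarized with a small toy model. This is only a sketch in plain Python, not MindSpore's implementation: ToyParam and assign_names are hypothetical names introduced for illustration, and the ValueError type is an assumption (the text above only says that an exception is thrown).

```python
from itertools import count


class ToyParam:
    """Hypothetical stand-in for Parameter that only stores an optional name."""

    def __init__(self, name=None):
        self.name = name


def assign_names(attrs):
    """Name every ToyParam in `attrs` (attribute name -> param, list, or tuple)."""
    seen_names = set()   # names already taken
    seen_ids = set()     # objects already checked (rule 4)
    counter = count(1)   # numbering for unnamed list/tuple members (rule 2)

    def register(param, fallback):
        if id(param) in seen_ids:
            return  # rule 4: the same object is checked only once
        seen_ids.add(id(param))
        if param.name is None:
            param.name = fallback()  # rules 1 and 2: fill in a default name
        if param.name in seen_names:
            # rule 3: duplicate names are rejected (exception type is assumed)
            raise ValueError(f"its name '{param.name}' already exists.")
        seen_names.add(param.name)

    for attr, value in attrs.items():
        if isinstance(value, (list, tuple)):
            for item in value:
                register(item, lambda: f"Parameter${next(counter)}")
        else:
            register(value, lambda attr=attr: attr)
    return sorted(seen_names)
```

Passing a dict that maps param_a to a ToyParam named "name_a", param_b to an unnamed ToyParam, and param_list to two unnamed ToyParam objects yields the names name_a, param_b, Parameter$1 and Parameter$2, mirroring cases 1) and 2) above.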

Examples

>>> import numpy as np
>>> from mindspore import Parameter, Tensor
>>> import mindspore.ops as ops
>>> import mindspore.nn as nn
>>> import mindspore
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.matmul = ops.MatMul()
...         self.weight = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="w", requires_grad=True)
...
...     def construct(self, x):
...         out = self.matmul(self.weight, x)
...         return out
>>> net = Net()
>>> x = Tensor(np.ones((2, 1)), mindspore.float32)
>>> print(net(x))
[[2.]]
>>> net.weight.set_data(Tensor(np.zeros((1, 2)), mindspore.float32))
>>> print(net(x))
[[0.]]
property cache_enable

Return whether the cache is enabled for the parameter.

property cache_shape

Return the cache shape corresponding to the parameter if a cache is used.

clone(init='same')[source]

Clone the parameter.

Parameters

init (Union[Tensor, str, numbers.Number]) – Initialize the shape and dtype of the parameter. If init is a Tensor or numbers.Number, clone a new parameter with the same shape and dtype, and the data of the new parameter will be set according to init. If init is a str, the init should be the alias of the class inheriting from Initializer. For example, if init is ‘same’, clone a new parameter with the same data, shape, and dtype. Default: ‘same’.

Returns

Parameter, a new parameter.

property comm_fusion

Get the fusion type (int) for communication operators corresponding to this parameter.

In AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode, some communication operators used for parameters or gradients aggregation are inserted automatically. The value of fusion must be greater than or equal to 0. When the value of fusion is 0, operators will not be fused together.

property data

Return the parameter object.

init_data(layout=None, set_sliced=False)[source]

Initialize the parameter’s data.

Parameters
  • layout (Union[None, tuple]) –

    The parameter’s layout info, in the form [dev_mat, tensor_map, slice_shape, filed_size, uniform_split, opt_shard_group]. Default: None. It is not None only in ‘SEMI_AUTO_PARALLEL’ or ‘AUTO_PARALLEL’ mode.

    • dev_mat (list(int)): The parameter’s device matrix.

    • tensor_map (list(int)): The parameter’s tensor map.

    • slice_shape (list(int)): The parameter’s slice shape.

    • filed_size (int): The parameter’s filed size.

    • uniform_split (bool): Whether the parameter is split evenly.

    • opt_shard_group (str): The group of the parameter while running optimizer parallel.

  • set_sliced (bool) – True if the parameter is set sliced after initializing the data. Default: False.

Raises
  • RuntimeError – If the parameter is created from an Initializer and the parallel mode has changed after the Initializer was created.

  • ValueError – If the length of the layout is less than 6.

  • TypeError – If layout is not tuple.

Returns

Parameter, the Parameter after initializing data. If the current Parameter was already initialized, the same initialized Parameter is returned.
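
The layout argument and its TypeError and ValueError conditions can be illustrated with a plain-Python sketch. This is not MindSpore's code: Layout and check_layout are hypothetical helpers, the field types follow the list above, and ignoring entries beyond the sixth is an assumption (the documentation only requires a length of at least 6).

```python
from typing import NamedTuple, Optional


class Layout(NamedTuple):
    """Hypothetical named view of the six documented layout fields."""
    dev_mat: list
    tensor_map: list
    slice_shape: list
    filed_size: int        # spelling follows the documented field name
    uniform_split: bool
    opt_shard_group: str


def check_layout(layout) -> Optional[Layout]:
    """Validate a layout the way the Raises section describes."""
    if layout is None:
        return None  # the default: no layout outside auto-parallel modes
    if not isinstance(layout, tuple):
        raise TypeError("layout must be a tuple")
    if len(layout) < 6:
        raise ValueError("layout must contain at least 6 entries")
    # Assumption: only the first six entries are interpreted.
    return Layout(*layout[:6])
```

For example, check_layout(([2, 4], [1, -1], [8, 16], 0, True, "")) returns a Layout whose slice_shape is [8, 16]; the values in this call are made up for illustration.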

property inited_param

Get the new parameter created by calling init_data.

The default is None. If self is a Parameter without data, the initialized Parameter with data is recorded here after init_data is called.

property key

Return the parameter's unique key.

property layerwise_parallel

Get the layerwise parallel status (bool) of the parameter.

When layerwise_parallel is True in DATA_PARALLEL and HYBRID_PARALLEL parallel mode, broadcast and gradient communication are not applied to the parameter.

property name

Get the name of the parameter.

property parallel_optimizer

Get the optimizer parallel status (bool) of the parameter.

It is used to filter the weight shard operation in AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode. It takes effect only when the parallel optimizer is enabled in mindspore.context.set_auto_parallel_context().

property parallel_optimizer_comm_recompute

Get the communication recompute status (bool) of optimizer parallel for the parameter.

In AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode, when the parallel optimizer is applied, some AllGather operators used for gathering parameters are inserted automatically. This property controls the recompute attribute of those AllGather operators.

Note

  • Only Graph mode is supported.

  • It is recommended to use cell.recompute(parallel_optimizer_comm_recompute=True/False) to configure the AllGather operators introduced by the parallel optimizer rather than using this interface directly.

property requires_grad

Return whether the parameter requires gradient.

The main function of requires_grad is to tell auto grad to start recording operations on a Tensor. If a Tensor has requires_grad=False, setting requires_grad to True makes auto grad start recording operations on that Tensor.

set_data(data, slice_shape=False)[source]

Set Parameter’s data.

Parameters
  • data (Union[Tensor, int, float]) – New data.

  • slice_shape (bool) – If slice_shape is set to True, the shape is not checked for consistency. Default: False.

Returns

Parameter, the parameter after setting data.

set_param_fl(push_to_server=False, pull_from_server=False, requires_aggr=True)[source]

Set how the parameter interacts with the server.

Parameters
  • push_to_server (bool) – Whether the parameter should be pushed to server. Default: False.

  • pull_from_server (bool) – Whether the parameter should be pulled from server. Default: False.

  • requires_aggr (bool) – Whether the parameter should be aggregated in the server. Default: True.

set_param_ps(init_in_server=False)[source]

Set whether the trainable parameter is updated by the parameter server and whether it is initialized on the server.

Note

It only works when a running task is in the parameter server mode.

Parameters

init_in_server (bool) – Whether a trainable parameter updated by the parameter server is initialized on the server. Default: False.

property sliced

Get slice status of the parameter.

property unique

Whether the parameter is already unique or not.