mindspore.mint.nn.LayerNorm
- class mindspore.mint.nn.LayerNorm(normalized_shape, eps=1e-5, elementwise_affine=True, bias=True, dtype=None)[source]
Layer normalization of the input mini-batch.
Layer Normalization applies normalization on a mini-batch of inputs for each single training case as described in the paper Layer Normalization.
Unlike Batch Normalization and Instance Normalization, which apply a scalar scale and bias to each entire channel/plane via the affine option, Layer Normalization applies per-element scale and bias with elementwise_affine. \(\gamma\) is the scale value learned through training and \(\beta\) is the shift value. It can be described using the following formula:
\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
Warning
This is an experimental API that is subject to change or deletion.
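The formula above can be illustrated outside MindSpore with a plain NumPy sketch (the layer_norm helper below is illustrative only, not part of the MindSpore API): statistics are computed over the last len(normalized_shape) dimensions, independently for each leading index.

```python
import numpy as np

def layer_norm(x, normalized_shape, gamma=None, beta=None, eps=1e-5):
    """Illustrative NumPy version of the LayerNorm formula above."""
    # Normalize over the trailing dimensions given by normalized_shape.
    axes = tuple(range(x.ndim - len(normalized_shape), x.ndim))
    mean = x.mean(axis=axes, keepdims=True)   # E[x]
    var = x.var(axis=axes, keepdims=True)     # Var[x]
    y = (x - mean) / np.sqrt(var + eps)
    if gamma is not None:                     # per-element scale
        y = y * gamma
    if beta is not None:                      # per-element shift
        y = y + beta
    return y

x = np.random.randn(20, 5, 10, 10).astype(np.float32)
y = layer_norm(x, normalized_shape=(5, 10, 10))
print(y.shape)  # (20, 5, 10, 10)
```

Each of the 20 leading slices now has (approximately) zero mean and unit variance over its last three dimensions.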
- Parameters
normalized_shape (Union(tuple[int], list[int], int)) – The normalized shape of x for LayerNorm.
eps (float, optional) – A value added to the denominator for numerical stability (\(\epsilon\)). Default: 1e-5.
elementwise_affine (bool, optional) – Whether an affine transformation is applied. When set to True, this module has learnable per-element affine parameters, initialized to ones (for weights) and zeros (for biases). Default: True.
bias (bool, optional) – If set to False, the layer will not learn an additive bias (only relevant if elementwise_affine is True). Default: True.
dtype (mindspore.dtype, optional) – Dtype of the Parameters. Default: None.
- Inputs:
x (Tensor) - The last several dimensions of x should match the normalized_shape. Its shape is generally \((N, ..., *normalized\_shape)\).
- Outputs:
output (Tensor) - The normalized, scaled, and offset tensor, with the same shape and data type as x.
- Supported Platforms:
Ascend
Examples
>>> import mindspore
>>> import numpy as np
>>> x = mindspore.Tensor(np.ones([20, 5, 10, 10]), mindspore.float32)
>>> shape1 = x.shape[1:]
>>> m = mindspore.mint.nn.LayerNorm(shape1)
>>> output = m(x)
>>> print(output.shape)
(20, 5, 10, 10)