mindspore.mint.nn.LayerNorm
- class mindspore.mint.nn.LayerNorm(normalized_shape, eps=1e-5, elementwise_affine=True, bias=True, dtype=None)[source]
Layer normalization of the input mini-batch.
Layer Normalization applies normalization on a mini-batch of inputs for each single training case as described in the paper Layer Normalization.
Unlike Batch Normalization and Instance Normalization, which apply a scalar scale and bias to each entire channel/plane via the affine option, Layer Normalization applies per-element scale and bias with elementwise_affine. \(\gamma\) is the scale value learned through training and \(\beta\) is the shift value. It can be described using the following formula:
\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
Warning
This is an experimental API that is subject to change or deletion.
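The formula above can be illustrated outside MindSpore with a plain NumPy sketch (the layer_norm helper below is illustrative only, not part of the MindSpore API): statistics are computed over the last len(normalized_shape) dimensions, independently for each leading index.

```python
import numpy as np

def layer_norm(x, normalized_shape, gamma=None, beta=None, eps=1e-5):
    """Illustrative NumPy version of the LayerNorm formula above."""
    # Normalize over the trailing dimensions given by normalized_shape.
    axes = tuple(range(x.ndim - len(normalized_shape), x.ndim))
    mean = x.mean(axis=axes, keepdims=True)   # E[x]
    var = x.var(axis=axes, keepdims=True)     # Var[x]
    y = (x - mean) / np.sqrt(var + eps)
    if gamma is not None:                     # per-element scale
        y = y * gamma
    if beta is not None:                      # per-element shift
        y = y + beta
    return y

x = np.random.randn(20, 5, 10, 10).astype(np.float32)
y = layer_norm(x, normalized_shape=(5, 10, 10))
print(y.shape)  # (20, 5, 10, 10)
```

Each of the 20 leading slices now has (approximately) zero mean and unit variance over its last three dimensions.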
- Parameters
normalized_shape (Union(tuple[int], list[int], int)) – The normalized shape of x for LayerNorm.
eps (float, optional) – A value added to the denominator for numerical stability (\(\epsilon\)). Default: 1e-5.
elementwise_affine (bool, optional) – Whether an affine transformation is applied. When set to True, this module has learnable per-element affine parameters, initialized to ones (for weights) and zeros (for biases). Default: True.
bias (bool, optional) – If set to False, the layer will not learn an additive bias (only relevant if elementwise_affine is True). Default: True.
dtype (mindspore.dtype, optional) – Dtype of the Parameters. Default: None.
- Inputs:
x (Tensor) - The last several dimensions of x should match the normalized_shape. Its shape is generally \((N, ..., *normalized\_shape)\).
- Outputs:
output (Tensor) - The normalized, scaled, and offset tensor, with the same shape and data type as x.
- Supported Platforms:
Ascend
Examples
>>> import mindspore
>>> import numpy as np
>>> x = mindspore.Tensor(np.ones([20, 5, 10, 10]), mindspore.float32)
>>> shape1 = x.shape[1:]
>>> m = mindspore.mint.nn.LayerNorm(shape1)
>>> output = m(x)
>>> print(output.shape)
(20, 5, 10, 10)