mindspore.nn.CrossEntropyLoss

class mindspore.nn.CrossEntropyLoss(weight=None, ignore_index=-100, reduction='mean', label_smoothing=0.0)

The cross entropy loss between input and target.

Warning

After version 2.9.0, the forward inputs logits and labels will be renamed to input and target.

CrossEntropyLoss supports two kinds of targets:

Class indices (int) in the range \([0, C)\), where \(C\) is the number of classes. When reduction is 'none', the loss can be described as:

\[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})} \cdot \mathbb{1}\{y_n \not= \text{ignore_index}\}\]

where \(x\) is the input, \(y\) is the target, \(w\) is the weight, \(N\) is the batch size, and \(c\) is the class index in \([0, C-1]\), where \(C\) is the number of classes.

If reduction is not 'none' (default 'mean'), then

\[\begin{split}\ell(x, y) = \begin{cases} \sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n} \cdot \mathbb{1}\{y_n \not= \text{ignore_index}\}} l_n, & \text{if reduction} = \text{'mean',}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{'sum'.} \end{cases}\end{split}\]
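
The class-index formulas above can be checked by hand with a small pure-Python sketch. This is only a reference computation for \((N, C)\) inputs, not MindSpore's implementation; the function name `cross_entropy_indices` is hypothetical and chosen for illustration.

```python
import math

def cross_entropy_indices(logits, targets, weight=None, ignore_index=-100, reduction="mean"):
    """Hypothetical reference for the class-index formulas, for (N, C) logits."""
    num_classes = len(logits[0])
    if weight is None:
        weight = [1.0] * num_classes  # unweighted: w_c = 1 for every class
    losses, weight_sum = [], 0.0
    for row, y in zip(logits, targets):
        if y == ignore_index:
            losses.append(0.0)  # the indicator term zeroes out ignored samples
            continue
        # log of the softmax denominator, shifted by the row maximum for stability
        m = max(row)
        log_norm = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(-weight[y] * (row[y] - log_norm))
        weight_sum += weight[y]
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        # weighted mean: divide by the summed weights of the non-ignored targets
        # (this sketch does not handle the case where every target is ignored)
        return sum(losses) / weight_sum
    raise ValueError("reduction must be 'none', 'mean' or 'sum'")

logits = [[0.0, 0.0], [0.0, 0.0], [1.0, 3.0]]
targets = [0, 1, -100]  # the last sample uses the default ignore_index
per_sample = cross_entropy_indices(logits, targets, reduction="none")
total = cross_entropy_indices(logits, targets, reduction="sum")
mean_loss = cross_entropy_indices(logits, targets, weight=[1.0, 3.0])
```

With uniform logits over two classes, each non-ignored sample has loss \(\log 2\), and the weighted mean is also \(\log 2\) because the weights cancel between the numerator and the denominator.
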

Probabilities (float) for each class, useful when labels beyond a single class per minibatch item are required. When reduction is 'none', the loss can be described as:

\[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - \sum_{c=1}^C w_c \log \frac{\exp(x_{n,c})}{\sum_{i=1}^C \exp(x_{n,i})} y_{n,c}\]

where \(x\) is the input, \(y\) is the target, \(w\) is the weight, \(N\) is the batch size, and \(c\) is the class index in \([0, C-1]\), where \(C\) is the number of classes.

If reduction is not 'none' (default 'mean'), then

\[\begin{split}\ell(x, y) = \begin{cases} \frac{\sum_{n=1}^N l_n}{N}, & \text{if reduction} = \text{'mean',}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{'sum'.} \end{cases}\end{split}\]
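
The probability-target formulas can be sketched the same way. The snippet below is a pure-Python reference for \((N, C)\) inputs under the formulas above, not MindSpore's implementation; `cross_entropy_probs` is a hypothetical name.

```python
import math

def cross_entropy_probs(logits, targets, weight=None, reduction="mean"):
    """Hypothetical reference for the probability-target formulas, for (N, C) logits."""
    num_classes = len(logits[0])
    if weight is None:
        weight = [1.0] * num_classes  # unweighted: w_c = 1 for every class
    losses = []
    for row, probs in zip(logits, targets):
        # log of the softmax denominator, shifted by the row maximum for stability
        m = max(row)
        log_norm = m + math.log(sum(math.exp(v - m) for v in row))
        # l_n = -sum_c w_c * log_softmax(x_n)_c * y_{n,c}
        losses.append(-sum(w * (v - log_norm) * p for w, v, p in zip(weight, row, probs)))
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)  # plain average over the batch size N
    raise ValueError("reduction must be 'none', 'mean' or 'sum'")

logits = [[0.0, 0.0], [2.0, 0.0]]
targets = [[0.5, 0.5], [1.0, 0.0]]  # each row is a distribution over the classes
soft_none = cross_entropy_probs(logits, targets, reduction="none")
soft_mean = cross_entropy_probs(logits, targets)
```

Unlike the class-index case, the 'mean' reduction here divides by the batch size \(N\) rather than by a sum of weights.
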

Parameters

weight (Tensor) – The rescaling weight applied to each class. If not None, its shape must be \((C,)\). The data type must be float32 or float16. Default: None .
ignore_index (int) – Specifies a target value that is ignored (typically for padding value) and does not contribute to the gradient. Default: -100 .
reduction (str, optional) –
The reduction method to apply to the output: 'none' , 'mean' , 'sum' . Default: 'mean' .
- 'none': no reduction will be applied.
- 'mean': compute and return the weighted mean of elements in the output.
- 'sum': the output elements will be summed.
label_smoothing (float) – The label smoothing value, a regularization technique that helps prevent the model from overfitting when computing the loss. The value must be in the range [0.0, 1.0]. Default: 0.0 .
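
This page does not spell out the label smoothing formula. As a rough illustration only, the sketch below assumes the common formulation, in which a one-hot target is mixed with a uniform distribution over the \(C\) classes; whether MindSpore applies exactly this form is an assumption, and `smooth_labels` is a hypothetical helper.

```python
def smooth_labels(target_index, num_classes, label_smoothing):
    """Hypothetical helper: mix a one-hot target with a uniform distribution.

    Assumes the common formulation
        y_smooth[c] = (1 - eps) * one_hot[c] + eps / C,
    which is not confirmed by this page.
    """
    if not 0.0 <= label_smoothing <= 1.0:
        raise ValueError("label_smoothing must be in [0.0, 1.0]")
    uniform = label_smoothing / num_classes
    return [
        (1.0 - label_smoothing) * (1.0 if c == target_index else 0.0) + uniform
        for c in range(num_classes)
    ]

# Roughly [0.025, 0.025, 0.925, 0.025]; the entries still sum to 1.
smoothed = smooth_labels(target_index=2, num_classes=4, label_smoothing=0.1)
```

Under this assumption, a value of 0.0 leaves the one-hot target unchanged and a value of 1.0 replaces it with a uniform distribution.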

Inputs:

logits (Tensor) - Tensor of shape \((C,)\), \((N, C)\) or \((N, C, d_1, d_2, ..., d_K)\), where \(C\) is the number of classes. Data type must be float16 or float32.
labels (Tensor) - For class indices, tensor of shape \(()\), \((N)\) or \((N, d_1, d_2, ..., d_K)\); data type must be int32. For probabilities, tensor of shape \((C,)\), \((N, C)\) or \((N, C, d_1, d_2, ..., d_K)\); data type must be float16 or float32.

Returns

Tensor, the computed cross entropy loss value.

Raises

TypeError – If weight is not a Tensor.
TypeError – If ignore_index is not an int.
TypeError – If the data type of weight is not float16 or float32.
ValueError – If reduction is not one of 'none', 'mean', 'sum'.
TypeError – If label_smoothing is not a float.
TypeError – If logits is not a Tensor.
TypeError – If labels is not a Tensor.

Supported Platforms: Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import mindspore.nn as nn
>>> import numpy as np
>>> # Case 1: Indices labels
>>> inputs = ms.Tensor(np.random.randn(3, 5), ms.float32)
>>> target = ms.Tensor(np.array([1, 0, 4]), ms.int32)
>>> loss = nn.CrossEntropyLoss()
>>> output = loss(inputs, target)
>>> # Case 2: Probability labels
>>> inputs = ms.Tensor(np.random.randn(3, 5), ms.float32)
>>> target = ms.Tensor(np.random.dirichlet(np.ones(5), size=3), ms.float32)  # each row sums to 1
>>> loss = nn.CrossEntropyLoss()
>>> output = loss(inputs, target)