oneflow.nn.NLLLoss¶

class oneflow.nn.NLLLoss(weight: Optional[oneflow.Tensor] = None, ignore_index: int = - 100, reduction: str = 'mean')¶

The negative log likelihood loss. It is useful to train a classification problem with C classes.

The input given through a forward call is expected to contain log-probabilities of each class. input has to be a Tensor of size either \((minibatch, C)\) or \((minibatch, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) for the K-dimensional case (described later).

Obtaining log-probabilities in a neural network is easily achieved by adding a LogSoftmax layer in the last layer of your network. You may use CrossEntropyLoss instead, if you prefer not to add an extra layer.

The target that this loss expects should be a class index in the range \([0, C-1]\) where C = number of classes;

The unreduced (i.e. with reduction set to 'none') loss can be described as:

\[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_{y_n} x_{n,y_n}, \quad w_{c} = \mathbb{1},\]

where \(x\) is the input, \(y\) is the target, \(w\) is the weight, and \(N\) is the batch size. If reduction is not 'none' (default 'mean'), then

\[\begin{split}\ell(x, y) = \begin{cases} \sum_{n=1}^N \frac{1}{N} l_n, & \text{if reduction} = \text{`mean';}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{`sum'.} \end{cases}\end{split}\]

Can also be used for higher dimension inputs, such as 2D images, by providing an input of size \((minibatch, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\), where \(K\) is the number of dimensions, and a target of appropriate shape (see below). In the case of images, it computes NLL loss per-pixel.

Parameters: reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed. Default: 'mean'

For example:

>>> import oneflow as flow
>>> import numpy as np

>>> input = flow.tensor(
... [[-0.1664078, -1.7256707, -0.14690138],
... [-0.21474946, 0.53737473, 0.99684894],
... [-1.135804, -0.50371903, 0.7645404]], dtype=flow.float32)
>>> target = flow.tensor(np.array([0, 1, 2]), dtype=flow.int32)
>>> m = flow.nn.NLLLoss(reduction="none")
>>> out = m(input, target)
>>> out
tensor([ 0.1664, -0.5374, -0.7645], dtype=oneflow.float32)

>>> m = flow.nn.NLLLoss(reduction="sum")
>>> out = m(input, target)
>>> out
tensor(-1.1355, dtype=oneflow.float32)

>>> m = flow.nn.NLLLoss(reduction="mean")
>>> out = m(input, target)
>>> out
tensor(-0.3785, dtype=oneflow.float32)