See CrossEntropyLoss for details.

The documentation is referenced from:

  • input (Tensor) – \((N, C)\) where C = number of classes or \((N, C, H, W)\) in case of 2D Loss, or \((N, C, d_1, d_2, ..., d_K)\) where \(K \geq 1\) in the case of K-dimensional loss. input is expected to contain unnormalized scores (often referred to as logits).

  • target (Tensor) – If containing class indices, shape \((N)\) where each value is \(0 \leq \text{targets}[i] \leq C-1\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss. If containing class probabilities, same shape as the input.

  • weight (Tensor, optional) – a manual rescaling weight given to each class. If given, has to be a Tensor of size C

  • ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. Note that ignore_index is only applicable when the target contains class indices. Default: -100

  • reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'

For example:

>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> input = flow.randn(3, 5, requires_grad=True)
>>> target = flow.ones(3, dtype=flow.int64)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()