backward(tensors: Union[oneflow.Tensor, Sequence[oneflow.Tensor]], grad_tensors: Optional[Union[oneflow.Tensor, Sequence[oneflow.Tensor]]], retain_graph: bool = False, create_graph: bool = False) → None
Computes the sum of gradients of given tensors with respect to graph leaves.
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.autograd.backward.html.
The graph is differentiated using the chain rule. If any element of tensors is non-scalar (i.e. its data has more than one element) and requires gradient, then the Jacobian-vector product is computed. In this case the function additionally requires grad_tensors to be specified. It should be a sequence of matching length that contains the “vector” in the Jacobian-vector product, usually the gradient of the differentiated function w.r.t. the corresponding tensors. (None is an acceptable value for all tensors that don’t need gradient.)
This function accumulates gradients in the leaves, so you might need to zero their .grad attributes or set them to None before calling it.
Using this method with create_graph=True will create a reference cycle between the parameter and its gradient, which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak.
grad_tensors (Tensor or Sequence[Tensor], optional) – The “vector” in the Jacobian-vector product, usually the gradients w.r.t. each element of the corresponding tensors. (None values can be specified for scalar Tensors or ones that don’t require grad.)
retain_graph (bool, optional) – If False, the graph used to compute the grads will be reset after backward is complete. Defaults to False. Note that in nearly all cases setting this option to True is not needed and can often be worked around in a much more efficient way. Defaults to the value of create_graph.
create_graph (bool, optional) – If True, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed. Defaults to