oneflow.autograd.grad
oneflow.autograd.grad(outputs: Union[oneflow.Tensor, Sequence[oneflow.Tensor]], inputs: Union[oneflow.Tensor, Sequence[oneflow.Tensor]], grad_outputs: Optional[Union[oneflow.Tensor, Sequence[oneflow.Tensor]]] = None, retain_graph: bool = False, create_graph: bool = False, allow_unused: bool = False) → Tuple[oneflow.Tensor]
Computes and returns the sum of gradients of outputs with respect to the inputs.
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.autograd.grad.html.
The graph is differentiated using the chain rule.
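As an illustrative sketch of what that sum looks like (the symbols y, x, v and g are introduced here for explanation and are not part of the OneFlow API): treating each element of the outputs as a scalar y_i, with a matching weight v_i taken from grad_outputs, the gradient g returned for an input x can be written as

```latex
g = \sum_{i=1}^{m} v_i \, \frac{\partial y_i}{\partial x}
```

When an output is a scalar and its grad_outputs entry is None, this corresponds to taking v_i = 1 for that output.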
grad_outputs should be a sequence of length matching outputs, containing the “vector” in the Jacobian-vector product. (None is an acceptable value for a tensor that doesn’t require gradient.)
- Parameters
outputs (Sequence[Tensor]) – Tensors of which the derivative will be computed.
inputs (Sequence[Tensor]) – Inputs w.r.t. which the derivative will be returned (and not accumulated into .grad).
grad_outputs (Sequence[Tensor], optional) – The “vector” in the Jacobian-vector product, usually the gradients w.r.t. each output. None values can be specified for scalar Tensors or ones that don’t require grad. Defaults to None.
retain_graph (bool, optional) – If False, the graph used to compute the grads will be reset after backward is complete. Note that in nearly all cases setting this option to True is not needed and can often be worked around in a much more efficient way. Defaults to False.
create_graph (bool, optional) – If True, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed. Defaults to False.
allow_unused (bool, optional) – If False, specifying inputs that were not used when computing outputs (and whose grad is therefore always zero) is an error. Defaults to False.
- Returns
A tuple of tensors containing the gradients for each of the inputs.
- Return type
Tuple(Tensor)