oneflow.autograd provides classes and functions implementing automatic differentiation of arbitrary scalar-valued functions. It requires minimal changes to existing code: you only need to declare the Tensors for which gradients should be computed with the requires_grad=True keyword. Currently, autograd is supported only for floating-point Tensor types (half, float, double and bfloat16).


Computes the sum of gradients of given tensors with respect to graph leaves.


Computes and returns the sum of gradients of outputs with respect to the inputs.

Locally disabling gradient computation


Context-manager that disables gradient calculation.


Context-manager that enables gradient calculation.


Context-manager that enables gradient calculation.


Context-manager that enables or disables inference mode.

In-place operations on Tensors

Supporting in-place operations in autograd is a hard matter, and we discourage their use in most cases. Autograd’s aggressive buffer freeing and reuse make it very efficient, and there are very few occasions when in-place operations actually lower memory usage by any significant amount. Unless you’re operating under heavy memory pressure, you might never need to use them.

Tensor autograd functions


Returns the gradient calculated by autograd functions.


Is True if gradients need to be computed for this Tensor, False otherwise.


All Tensors whose requires_grad is False are leaf Tensors by convention.

oneflow.Tensor.backward([gradient, …])

Computes the gradient of current tensor w.r.t. graph leaves.



Registers a backward hook.


Enables this Tensor to have its grad populated during backward().


class oneflow.autograd.Function(self)

Base class to create custom autograd.Function.

To create a custom autograd.Function, subclass this class and implement the forward() and backward() static methods. Then, to use your custom op in the forward pass, call the class method apply() or __call__(). Do not call forward() directly.

For example:

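import oneflow as flow
from oneflow.autograd import Function

class Exp(Function):
    @staticmethod
    def forward(ctx, i):
        result = i.exp()
        # Save the output so that backward() can reuse it
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, = ctx.saved_tensors
        # d/dx exp(x) = exp(x), which is the saved forward result
        return grad_output * result

# Use it by calling the apply method or __call__ method
input = flow.tensor([0.0, 1.0], requires_grad=True)
output = Exp.apply(input)  # output = Exp()(input)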


Override this function for custom forward calculation.


Override this function for custom backward calculation.


Calculate output tensors and build backward graph.

Context method mixins

When creating a new Function, the following methods are available to ctx.