oneflow.nn.utils.clip_grad_norm_

oneflow.nn.utils.clip_grad_norm_(parameters: Union[oneflow.Tensor, Iterable[oneflow.Tensor]], max_norm: float, norm_type: float = 2.0, fused: bool = False, error_if_nonfinite: bool = False) → oneflow.Tensor

Clips the gradient norm of an iterable of parameters. The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in place.

Parameters

parameters (Iterable[Tensor] or Tensor) – an iterable of Tensors or a single Tensor whose gradients will be clipped
max_norm (float or int) – maximum allowed norm of the gradients
norm_type (float or int) – type of the p-norm to use. Can be 'inf' for the infinity norm. Default: 2.0
error_if_nonfinite (bool) – if True, an error is raised if the total norm of the gradients from parameters is nan, inf, or -inf. Default: False (will switch to True in the future)

Returns

Total norm of the parameters' gradients (viewed as a single vector), computed before clipping. The gradients themselves are clipped in place.

For example:

>>> import oneflow as flow
>>> import numpy as np
>>> x1 = flow.tensor(np.array([[2, 3, 4], [1.5, 2.6, 3.7]]).astype(np.float32), requires_grad=True)
>>> m1 = flow.nn.ReLU()
>>> out1 = m1(x1)
>>> out1 = out1.sum()
>>> out1.backward()
>>> norm1 = flow.nn.utils.clip_grad_norm_(x1, 0.6, 1.0)
>>> norm1
tensor(6., dtype=oneflow.float32)
>>> x1.grad
tensor([[0.1000, 0.1000, 0.1000],
        [0.1000, 0.1000, 0.1000]], dtype=oneflow.float32)
>>> x2 = flow.tensor(np.array([[-2, -3, -4], [2.5, 0, 3.2]]).astype(np.float32), requires_grad=True)
>>> out2 = flow.atan(x2)
>>> out2 = out2.sum()
>>> out2.backward()
>>> norm2 = flow.nn.utils.clip_grad_norm_(x2, 0.5)
>>> norm2
tensor(1.0394, dtype=oneflow.float32)
>>> x2.grad
tensor([[0.0962, 0.0481, 0.0283],
        [0.0663, 0.4810, 0.0428]], dtype=oneflow.float32)
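
The arithmetic behind the two examples above can be reproduced with a small pure-Python sketch. This is an illustration of the usual "global norm, then rescale" scheme and not OneFlow's actual implementation: the function name clip_grad_norm_sketch, the representation of gradients as plain lists, and the eps value of 1e-6 are all assumptions made for the example.

```python
import math


def clip_grad_norm_sketch(grads, max_norm, norm_type=2.0, eps=1e-6,
                          error_if_nonfinite=False):
    """Hypothetical helper mirroring the documented behavior.

    grads: list of flat lists of floats (one list per parameter),
           rescaled in place when the total norm exceeds max_norm.
    eps:   assumed small constant guarding against division by zero.
    Returns the total norm computed before clipping.
    """
    # Treat all gradients as one concatenated vector.
    flat = [abs(g) for grad in grads for g in grad]
    if not flat:
        return 0.0

    if norm_type == float("inf"):
        total_norm = max(flat)
    else:
        total_norm = sum(v ** norm_type for v in flat) ** (1.0 / norm_type)

    if error_if_nonfinite and not math.isfinite(total_norm):
        raise RuntimeError("total gradient norm is non-finite")

    # Scale only when the norm exceeds the limit; never scale up.
    clip_coef = max_norm / (total_norm + eps)
    if clip_coef < 1.0:
        for grad in grads:
            for i in range(len(grad)):
                grad[i] *= clip_coef
    return total_norm


# First example above: six gradients of 1.0, 1-norm, max_norm = 0.6.
grads = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
norm = clip_grad_norm_sketch(grads, 0.6, norm_type=1.0)
print(norm)   # 6.0
print(grads)  # every entry is approximately 0.1
```

In the sketch, the returned value is the norm before clipping, which is why the first example reports 6 even though the clipped gradients sum to roughly 0.6.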