oneflow.nn.utils.clip_grad_norm_
oneflow.nn.utils.clip_grad_norm_(parameters: Union[oneflow.Tensor, Iterable[oneflow.Tensor]], max_norm: float, norm_type: float = 2.0, fused: bool = False, error_if_nonfinite: bool = False) → oneflow.Tensor
Clips the gradient norm of an iterable of parameters. The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in place.
- Parameters
parameters (Iterable[Tensor] or Tensor) – an iterable of Tensors or a single Tensor that will have gradients normalized
max_norm (float or int) – max norm of the gradients
norm_type (float or int) – type of the p-norm to use. Can be 'inf' for the infinity norm. Default: 2.0
error_if_nonfinite (bool) – if True, an error is thrown if the total norm of the gradients from parameters is nan, inf, or -inf. Default: False (will switch to True in the future)
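The effect of error_if_nonfinite can be pictured with a small pure-Python sketch. The helper name check_total_norm and the choice of RuntimeError are assumptions made for illustration; this page does not state which exception type OneFlow raises.

```python
import math

def check_total_norm(total_norm, error_if_nonfinite=False):
    """Hypothetical helper: reject a nan/inf/-inf total norm only when asked to."""
    is_nonfinite = math.isnan(total_norm) or math.isinf(total_norm)
    if error_if_nonfinite and is_nonfinite:
        # Assumed exception type, chosen for the sketch only.
        raise RuntimeError("total norm of the gradients is non-finite")
    return total_norm

# With the default (False), a non-finite norm passes through silently.
silent = check_total_norm(float("nan"))

# With True, the same situation is reported as an error.
try:
    check_total_norm(float("inf"), error_if_nonfinite=True)
    raised = False
except RuntimeError:
    raised = True
```

With the default setting, a non-finite norm is not reported, so enabling the flag can help surface diverging training runs early.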
- Returns
Total norm of the gradients of the parameters (viewed as a single vector), as measured before clipping. The gradients themselves are clipped in place.
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> x1 = flow.tensor(np.array([[2, 3, 4], [1.5, 2.6, 3.7]]).astype(np.float32), requires_grad=True)
>>> m1 = flow.nn.ReLU()
>>> out1 = m1(x1)
>>> out1 = out1.sum()
>>> out1.backward()
>>> norm1 = flow.nn.utils.clip_grad_norm_(x1, 0.6, 1.0)
>>> norm1
tensor(6., dtype=oneflow.float32)
>>> x1.grad
tensor([[0.1000, 0.1000, 0.1000],
        [0.1000, 0.1000, 0.1000]], dtype=oneflow.float32)
>>> x2 = flow.tensor(np.array([[-2, -3, -4], [2.5, 0, 3.2]]).astype(np.float32), requires_grad=True)
>>> out2 = flow.atan(x2)
>>> out2 = out2.sum()
>>> out2.backward()
>>> norm2 = flow.nn.utils.clip_grad_norm_(x2, 0.5)
>>> norm2
tensor(1.0394, dtype=oneflow.float32)
>>> x2.grad
tensor([[0.0962, 0.0481, 0.0283],
        [0.0663, 0.4810, 0.0428]], dtype=oneflow.float32)
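The numbers in the example can be checked by hand. The following is a minimal pure-Python sketch of the arithmetic, not OneFlow's implementation: the helper name clip_grad_norm_sketch, the use of nested lists in place of tensors, and the small eps term of 1e-6 in the denominator are all assumptions made for illustration.

```python
import math

def clip_grad_norm_sketch(grads, max_norm, norm_type=2.0, eps=1e-6):
    """Illustrative only: return (total_norm, clipped_grads) for lists of floats."""
    # Treat every gradient as part of one long vector.
    flat = [g for grad in grads for g in grad]
    if norm_type == math.inf:
        total_norm = max(abs(g) for g in flat)
    else:
        total_norm = sum(abs(g) ** norm_type for g in flat) ** (1.0 / norm_type)
    # Scale factor, capped at 1 so gradients already under the limit stay unchanged.
    clip_coef = min(max_norm / (total_norm + eps), 1.0)
    clipped = [[g * clip_coef for g in grad] for grad in grads]
    return total_norm, clipped

# First example: d/dx sum(ReLU(x)) is 1 for each positive input.
grads1 = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
total1, clipped1 = clip_grad_norm_sketch(grads1, max_norm=0.6, norm_type=1.0)

# Second example: d/dx atan(x) = 1 / (1 + x^2), default 2-norm.
x2 = [[-2.0, -3.0, -4.0], [2.5, 0.0, 3.2]]
grads2 = [[1.0 / (1.0 + x * x) for x in row] for row in x2]
total2, clipped2 = clip_grad_norm_sketch(grads2, max_norm=0.5)
```

In the first case the 1-norm of six ones is 6, so each gradient is scaled by roughly 0.6 / 6 = 0.1. In the second case the 2-norm is about 1.0394, so each gradient is scaled by roughly 0.5 / 1.0394, which is about 0.481.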