oneflow.optim.lr_scheduler.CosineDecayLR
class oneflow.optim.lr_scheduler.CosineDecayLR(optimizer: oneflow.optim.optimizer.Optimizer, decay_steps: int, alpha: float = 0.0, last_step: int = -1, verbose: bool = False)
This operator creates a cosine-decayed learning rate scheduler.
Until the current step reaches decay_steps, the learning rate is updated as:
\[ \begin{aligned}& cos\_decay = 0.5*(1+\cos(\pi*\frac{current\_step}{decay\_steps}))\\& decay\_factor = (1-\alpha)*cos\_decay+\alpha\\& learning\_rate = base\_learning\_rate*decay\_factor\end{aligned} \]
Once the current step reaches decay_steps, the learning rate stays at:
\[learning\_rate = base\_learning\_rate*\alpha\]
This schedule was proposed in SGDR: Stochastic Gradient Descent with Warm Restarts. Note that this only implements the cosine annealing part of SGDR, not the restarts.
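As a worked illustration of the formulas above, take some assumed values that do not come from the OneFlow documentation: base_learning_rate = 0.1, decay_steps = 100, \(\alpha\) = 0.1, and current_step = 50. Halfway through the decay the cosine term is \(\cos(\pi/2) = 0\), so:
\[ \begin{aligned}& cos\_decay = 0.5*(1+\cos(\pi*\frac{50}{100})) = 0.5\\& decay\_factor = (1-0.1)*0.5+0.1 = 0.55\\& learning\_rate = 0.1*0.55 = 0.055\end{aligned} \]
With the same assumed values, from step 100 onward the second formula gives \(learning\_rate = 0.1*0.1 = 0.01\), so \(\alpha\) acts as a floor expressed as a fraction of the base learning rate.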
Parameters
optimizer (Optimizer) – Wrapped optimizer.
decay_steps (int) – The number of steps over which the learning rate decays.
alpha (float, optional) – The learning rate scale factor (\(\alpha\)); after decay_steps the learning rate equals the base learning rate times this factor. (default: 0.0)
last_step (int, optional) – The index of the last step. (default: -1)
verbose (bool, optional) – If True, prints a message to stdout for each update. (default: False)
For example:
```python
import oneflow as flow

...
cosine_decay_lr = flow.optim.lr_scheduler.CosineDecayLR(optimizer, decay_steps=100, alpha=0.0)
for epoch in range(num_epoch):
    train(...)
    cosine_decay_lr.step()
```
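To see which values the schedule in the example above would produce, the documented formulas can be reimplemented in plain Python. This is a minimal sketch, not OneFlow's own implementation: the helper name `cosine_decay_lr` is hypothetical, and the base learning rate of 0.1 is an assumed value, since the example does not show how the optimizer was configured.

```python
import math


def cosine_decay_lr(base_lr: float, step: int, decay_steps: int, alpha: float = 0.0) -> float:
    """Hypothetical helper mirroring the documented formulas; not OneFlow's code."""
    if step < decay_steps:
        # Cosine term falls from 1 at step 0 towards 0 at decay_steps.
        cos_decay = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
        decay_factor = (1.0 - alpha) * cos_decay + alpha
    else:
        # Once decay_steps is reached, the rate stays at base_lr * alpha.
        decay_factor = alpha
    return base_lr * decay_factor


# Same decay_steps and alpha as the example; base_lr = 0.1 is an assumption.
schedule = [
    cosine_decay_lr(0.1, step, decay_steps=100, alpha=0.0)
    for step in (0, 25, 50, 75, 100, 150)
]
print(schedule)
```

Under these assumptions the rate starts at 0.1, is roughly 0.05 at step 50, and is 0.0 from step 100 onward because alpha is 0.0. A non-zero alpha would keep the final rate at base_lr * alpha instead.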
__init__(optimizer: oneflow.optim.optimizer.Optimizer, decay_steps: int, alpha: float = 0.0, last_step: int = -1, verbose: bool = False)
Initialize self. See help(type(self)) for accurate signature.
Methods
__delattr__(name, /) – Implement delattr(self, name).
__dir__() – Default dir() implementation.
__eq__(value, /) – Return self==value.
__format__(format_spec, /) – Default object formatter.
__ge__(value, /) – Return self>=value.
__getattribute__(name, /) – Return getattr(self, name).
__gt__(value, /) – Return self>value.
__hash__() – Return hash(self).
__init__(optimizer, decay_steps[, alpha, …]) – Initialize self.
__init_subclass__ – This method is called when a class is subclassed.
__le__(value, /) – Return self<=value.
__lt__(value, /) – Return self<value.
__ne__(value, /) – Return self!=value.
__new__(**kwargs) – Create and return a new object.
__reduce__() – Helper for pickle.
__reduce_ex__(protocol, /) – Helper for pickle.
__repr__() – Return repr(self).
__setattr__(name, value, /) – Implement setattr(self, name, value).
__sizeof__() – Size of object in memory, in bytes.
__str__() – Return str(self).
__subclasshook__ – Abstract classes can override this to customize issubclass().
_generate_conf_for_graph(lr_conf)
_init_base_lrs()
get_last_lr() – Return the last learning rate computed by the current scheduler.
get_lr(base_lr, step) – Compute the learning rate using the chainable form of the scheduler.
load_state_dict(state_dict) – Load the scheduler's state.
print_lr(group, lr) – Display the current learning rate.
state_dict() – Return the state of the scheduler as a dict.
step()
update_lrs(lrs)