oneflow.optim.lr_scheduler.CosineAnnealingWarmRestarts
class oneflow.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer: oneflow.optim.optimizer.Optimizer, T_0: int, T_mult: int = 1, eta_min: float = 0.0, decay_rate: float = 1.0, restart_limit: int = 0, last_step: int = -1, verbose: bool = False)

Set the learning rate of each parameter group using a cosine annealing schedule, where \(\eta_{max}\) is set to the initial lr, \(T_{cur}\) is the number of steps since the last restart, and \(T_{i}\) is the number of steps between two warm restarts in SGDR:
\[\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{i}}\pi\right)\right)\]

When \(T_{cur}=T_{i}\), set \(\eta_t = \eta_{min}\). When \(T_{cur}=0\) after a restart, set \(\eta_t=\eta_{max}\).
It has been proposed in SGDR: Stochastic Gradient Descent with Warm Restarts.
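As a quick numerical check, the annealing formula above can be reproduced in a few lines of plain Python (a sketch; the function name cosine_annealed_lr and the sample values are illustrative, not part of the API):

    import math

    def cosine_annealed_lr(eta_min, eta_max, t_cur, t_i):
        # eta_t = eta_min + 1/2 * (eta_max - eta_min) * (1 + cos(pi * t_cur / t_i))
        return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t_cur / t_i))

    print(cosine_annealed_lr(0.0, 0.1, 0, 10))   # 0.1  (t_cur == 0: back to eta_max)
    print(cosine_annealed_lr(0.0, 0.1, 10, 10))  # 0.0  (t_cur == t_i: down to eta_min)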
Parameters
optimizer (Optimizer) – Wrapped optimizer.
T_0 (int) – Number of iterations for the first restart.
T_mult (int, optional) – A factor by which \(T_{i}\) increases after a restart. Default: 1.
eta_min (float, optional) – Minimum learning rate. Default: 0.
decay_rate (float, optional) – Decay rate applied to the learning rate at every restart. Default: 1.0.
restart_limit (int, optional) – The limit on the number of restarts. 0 indicates unlimited restarts. Default: 0.
last_step (int, optional) – The index of the last step. Default: -1.
verbose (bool) – If True, prints a message to stdout for each update. Default: False.
__init__(optimizer: oneflow.optim.optimizer.Optimizer, T_0: int, T_mult: int = 1, eta_min: float = 0.0, decay_rate: float = 1.0, restart_limit: int = 0, last_step: int = -1, verbose: bool = False)

Initialize self. See help(type(self)) for accurate signature.
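A minimal usage sketch (assuming the usual OneFlow training-loop pieces; the model, the placeholder loss, and the values T_0=10, T_mult=2 are illustrative, not prescribed by the API):

    import oneflow as flow

    model = flow.nn.Linear(8, 2)
    optimizer = flow.optim.SGD(model.parameters(), lr=0.1)  # lr=0.1 becomes eta_max
    scheduler = flow.optim.lr_scheduler.CosineAnnealingWarmRestarts(
        optimizer, T_0=10, T_mult=2, eta_min=1e-5
    )

    for step in range(30):
        optimizer.zero_grad()
        loss = model(flow.randn(4, 8)).sum()  # placeholder forward pass and loss
        loss.backward()
        optimizer.step()
        scheduler.step()  # first restart after 10 steps, the next 20 steps later (T_mult=2)

Calling scheduler.step() once per iteration matches the SGDR formulation, where \(T_{cur}\) counts steps since the last restart.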
Methods
__delattr__(name, /)
    Implement delattr(self, name).
__dir__()
    Default dir() implementation.
__eq__(value, /)
    Return self==value.
__format__(format_spec, /)
    Default object formatter.
__ge__(value, /)
    Return self>=value.
__getattribute__(name, /)
    Return getattr(self, name).
__gt__(value, /)
    Return self>value.
__hash__()
    Return hash(self).
__init__(optimizer, T_0[, T_mult, eta_min, …])
    Initialize self.
__init_subclass__
    This method is called when a class is subclassed.
__le__(value, /)
    Return self<=value.
__lt__(value, /)
    Return self<value.
__ne__(value, /)
    Return self!=value.
__new__(**kwargs)
    Create and return a new object.
__reduce__()
    Helper for pickle.
__reduce_ex__(protocol, /)
    Helper for pickle.
__repr__()
    Return repr(self).
__setattr__(name, value, /)
    Implement setattr(self, name, value).
__sizeof__()
    Size of object in memory, in bytes.
__str__()
    Return str(self).
__subclasshook__
    Abstract classes can override this to customize issubclass().
_generate_conf_for_graph(lr_conf)
_init_base_lrs()
get_last_lr()
    Return the last learning rate computed by the current scheduler.
get_lr(base_lr, step)
    Compute the learning rate using the chainable form of the scheduler.
load_state_dict(state_dict)
    Load the scheduler's state.
print_lr(group, lr)
    Display the current learning rate.
state_dict()
    Return the state of the scheduler as a dict.
step()
update_lrs(lrs)
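Because the scheduler exposes state_dict() and load_state_dict(), its position within the current cosine cycle can be checkpointed alongside the optimizer. A sketch, continuing from the usage example above (the surrounding save/restore plumbing is assumed):

    # Save: capture both the optimizer state and the scheduler's cycle progress.
    checkpoint = {"opt": optimizer.state_dict(), "sched": scheduler.state_dict()}

    # Restore: rebuild the optimizer and scheduler first, then load both states.
    optimizer.load_state_dict(checkpoint["opt"])
    scheduler.load_state_dict(checkpoint["sched"])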