mixmo.core.scheduler.GradualWarmupScheduler

class mixmo.core.scheduler.GradualWarmupScheduler(optimizer, multiplier, total_steps)

    Bases: torch.optim.lr_scheduler._LRScheduler

    Gradually warms up (increases) the learning rate in the optimizer. Proposed in 'Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour'.

    :param optimizer: Wrapped optimizer.
    :type optimizer: Optimizer
    :param multiplier: Target learning rate = base_lr * multiplier if multiplier > 1.0. If multiplier = 1.0, the learning rate starts from 0 and ends at base_lr.
    :param total_steps: The target learning rate is reached at total_steps, increasing gradually until then.
__init__(optimizer, multiplier, total_steps)

    Initialize self. See help(type(self)) for accurate signature.
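A minimal usage sketch, assuming only the documented constructor arguments and the step() and get_last_lr() methods listed under Methods below. The model, optimizer, and the values multiplier=1.0 and total_steps=500 are illustrative placeholders, not part of the mixmo API.

    import torch
    from mixmo.core.scheduler import GradualWarmupScheduler

    # Placeholder model and optimizer; with multiplier = 1.0 the warmup
    # ramps the learning rate from 0 up to this base lr.
    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # Warm the learning rate up over the first 500 steps.
    scheduler = GradualWarmupScheduler(optimizer, multiplier=1.0, total_steps=500)

    for step in range(500):
        optimizer.step()                    # parameter update (loss/backward omitted here)
        scheduler.step()                    # advance the warmup schedule by one step
        current_lr = scheduler.get_last_lr()  # last learning rate computed by the scheduler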
Methods
__init__(optimizer, multiplier, total_steps)
    Initialize self.

get_last_lr()
    Return last computed learning rate by current scheduler.

get_lr()

get_lr_warmup()

load_state_dict(state_dict)
    Loads the scheduler's state.

state_dict()
    Returns the state of the scheduler as a dict.

step([steps])
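The parameter descriptions above imply a linear warmup rule. The sketch below only illustrates those documented endpoints (from 0 to base_lr when multiplier = 1.0, and up to base_lr * multiplier at total_steps when multiplier > 1.0); the helper name warmup_lr is hypothetical, and the actual get_lr_warmup() implementation may differ in its intermediate values.

    def warmup_lr(base_lr, multiplier, step, total_steps):
        """Hypothetical helper illustrating the documented warmup targets.

        With multiplier = 1.0 the rate ramps linearly from 0 to base_lr.
        With multiplier > 1.0 it reaches base_lr * multiplier at total_steps
        (here assumed to ramp from base_lr, as in common warmup schedulers).
        """
        progress = min(step / total_steps, 1.0)
        if multiplier == 1.0:
            return base_lr * progress
        return base_lr * ((multiplier - 1.0) * progress + 1.0)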