mixmo.core.scheduler.GradualWarmupScheduler

class mixmo.core.scheduler.GradualWarmupScheduler(optimizer, multiplier, total_steps)[source]

Bases: torch.optim.lr_scheduler._LRScheduler

Gradually warms up (increases) the learning rate in the wrapped optimizer. Proposed in ‘Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour’.

Parameters:
    optimizer (Optimizer) – Wrapped optimizer.
    multiplier – Target learning rate = base lr * multiplier if multiplier > 1.0. If multiplier = 1.0, the learning rate starts from 0 and ends at the base lr.
    total_steps – Number of steps over which the learning rate is gradually increased until the target learning rate is reached at total_steps.
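A minimal construction sketch, assuming the mixmo package is importable and using a throwaway model and SGD optimizer purely for illustration:

    import torch
    from mixmo.core.scheduler import GradualWarmupScheduler

    # Hypothetical model and optimizer, only to have parameters to schedule.
    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # With multiplier=1.0 the learning rate ramps from 0 up to the base lr
    # (0.1 here) over the first 500 steps, per the parameter description above.
    scheduler = GradualWarmupScheduler(optimizer, multiplier=1.0, total_steps=500)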

__init__(optimizer, multiplier, total_steps)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(optimizer, multiplier, total_steps)

Initialize self.

get_last_lr()

Return last computed learning rate by current scheduler.

get_lr()

get_lr_warmup()

load_state_dict(state_dict)

Loads the scheduler's state.

state_dict()

Returns the state of the scheduler as a dict.

step([steps])
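
A hedged usage sketch of the methods listed above. The per-batch call pattern for step() is an assumption based on the scheduler counting steps rather than epochs, and train_loader and loss_fn are placeholders:

    # Training loop: advance the warmup by one step after each optimizer update.
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        scheduler.step()                     # advance the warmup schedule
        current_lr = scheduler.get_last_lr() # last learning rate computed by the scheduler

    # The scheduler can be checkpointed and restored like other PyTorch schedulers.
    state = scheduler.state_dict()
    scheduler.load_state_dict(state)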