Add a Scheduler API to enable time-based learning rate (and weight decay) adjustments
This MR adds a long-awaited feature: scheduling the learning rate (and/or the weight decay factor) so that it takes different values over time, based on the number of training steps and/or rounds already taken.
This takes the form of a new (and extensible) `Scheduler` API, implemented under the new `declearn.optimizer.schedulers` submodule. Instances of `Scheduler` subclasses (or their JSON-serializable specs) may be passed to `Optimizer.__init__` instead of float values to specify the `lrate` and/or `w_decay` parameters, resulting in time-varying values being computed and used rather than a constant one.
This MR implements the base API, integrates it with the `Optimizer` one, updates and adds dedicated unit tests, and implements a number of standard scheduling rules. The latter include:
- Various kinds of decay (step-, multi-step- or round-based; linear, exponential, polynomial...).
- Cyclic learning rates (based on approaches from the literature).
- Linear warmup (step- or round-based; combinable with another scheduler to use after the warmup period, as sketched below).
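As a sketch of how such rules could be combined (again using hypothetical class names and signatures, which may not match the shipped implementations):

```python
from declearn.optimizer import Optimizer
from declearn.optimizer.schedulers import LinearWarmup, ExponentialDecay  # hypothetical names

# Hypothetical composition: ramp the learning rate up linearly over the first
# 100 steps, then hand over to an exponential decay rule.
schedule = LinearWarmup(  # hypothetical signature
    base=ExponentialDecay(base=0.1, rate=0.99),  # scheduler used after warmup
    warmup=100,
)
optim = Optimizer(lrate=schedule, w_decay=0.001)
```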
Closes #5