Revise `Optimizer` and add state-access methods.

BIGAUD Nathan requested to merge stateful-optimizer into main

Main idea:

The overall goal was to enable checkpointing optimizers between rounds. In the process, we reviewed the existing Checkpointer class, and this MR was eventually split into two parts:

  • the current MR retains only the Optimizer-related revisions
  • another MR is being created for the Checkpointer-related revisions

This MR now performs the following changes:

  • Add a get_state and set_state pair of methods to OptiModule and Optimizer, enabling access to their internal state variables (a usage sketch follows this list).
  • Add the Optimizer.start_round method, which triggers the start_round method of each wrapped Regularizer, and call it in TrainingManager.training_round.
  • Garbage-collect the OptiModule.(de)serialize methods, which have been made obsolete by OptiModule.from_specs and revisions to the config methods.
  • Fix YogiModule, which had regressed into no longer differing from its parent AdamModule.
  • Perform an overall clean-up of the Optimizer and OptiModule classes and of their docstrings.
  • Implement unit tests for Optimizer.
  • Update unit tests for OptiModule classes.
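
For illustration, here is a minimal, self-contained sketch of the state-access pattern introduced above. The MockMomentumModule and MockOptimizer classes are hypothetical stand-ins used for illustration only, not the actual OptiModule / Optimizer implementations:

```python
from typing import Any, Dict, List


class MockMomentumModule:
    """Toy stand-in for an OptiModule with one internal state variable."""

    def __init__(self, beta: float = 0.9) -> None:
        self.beta = beta
        self.velocity = 0.0  # internal state to checkpoint between rounds

    def run(self, gradient: float) -> float:
        # Classic momentum accumulation on a scalar gradient.
        self.velocity = self.beta * self.velocity + gradient
        return self.velocity

    def get_state(self) -> Dict[str, Any]:
        return {"velocity": self.velocity}

    def set_state(self, state: Dict[str, Any]) -> None:
        self.velocity = state["velocity"]


class MockOptimizer:
    """Toy stand-in for Optimizer, stacking its modules' state access."""

    def __init__(self, modules: List[MockMomentumModule]) -> None:
        self.modules = modules

    def get_state(self) -> List[Dict[str, Any]]:
        return [module.get_state() for module in self.modules]

    def set_state(self, states: List[Dict[str, Any]]) -> None:
        for module, state in zip(self.modules, states):
            module.set_state(state)


# Checkpoint between rounds: capture, then restore, the modules' state.
optim = MockOptimizer([MockMomentumModule()])
optim.modules[0].run(1.0)
checkpoint = optim.get_state()                # [{"velocity": 1.0}]
fresh = MockOptimizer([MockMomentumModule()])
fresh.set_state(checkpoint)                   # resumes with the same momentum
assert fresh.get_state() == checkpoint
```

In the actual classes, state access is meant to cover each module's own internal variables (e.g. momentum or variance estimates), so that an Optimizer can be checkpointed and restored between rounds without losing its adaptive behaviour.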

Legacy to-do for the initial implementation by @nbigaud:

  • Create the abstract method for OptiModule
  • Select the list of actual variables to be saved as part of it
  • Implement it for the full list:
    • _adaptative.py: AdaGradModule.state, AdamModule.steps, AdamModule._vmax
    • _momentum.py: MomentumModule.velocity, EWMAModule.state (and YogiMomentumModule by inheritance)
  • Create Optimizer-level methods to get and set state using JSON, simply stacking calls to the modules' state method (see the sketch after this list)
  • Merge recent main branch changes
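
As a hypothetical illustration of the JSON-based round trip mentioned in that to-do item, assuming the stacked per-module states contain only JSON-serializable values (actual state variables may need dedicated packing):

```python
import json

# Hypothetical per-module states as stacked by an Optimizer-level get_state.
stacked_states = [
    {"velocity": 0.42},              # e.g. MomentumModule.velocity
    {"steps": 10, "vmax": 0.07},     # e.g. AdamModule.steps / AdamModule._vmax
]

dump = json.dumps(stacked_states)    # checkpoint to a JSON string
restored = json.loads(dump)          # reload at the start of the next round
assert restored == stacked_states
```
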
Edited by ANDREY Paul
