Adding tentative Nesterov plus doc

BIGAUD Nathan requested to merge nesterov into main

I am quite unsure about this implementation, which is a loose mix of three implementations:

  • Our existing implementation of basic momentum, which seems inspired by torch in that we factor the learning rate out of the whole update, but also by something else, given our use of (1-beta)
  • The torch implementation, though I am not sure I understand the code properly
  • The tf implementation, which has clearer docs but a different formula
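
To make the comparison concrete, here is a minimal sketch of the three update rules as I read them, run on a toy objective. The torch and tf forms follow the formulas given in their SGD docs (dampening 0, constant learning rate). The (1-beta) variant is only my guess at how a Nesterov step could look in our convention, not the code in this MR. All function names and the toy gradient are hypothetical.

```python
def grad(w):
    # Gradient of the toy objective f(w) = 0.5 * w**2 (an assumption for the demo).
    return w


def nesterov_torch_style(w, lr, mu, steps):
    # Documented torch form: buffer accumulates raw gradients,
    # and the learning rate multiplies the whole step.
    b = 0.0
    for _ in range(steps):
        g = grad(w)
        b = mu * b + g
        w = w - lr * (g + mu * b)
    return w


def nesterov_tf_style(w, lr, mu, steps):
    # Documented tf/Keras form: the learning rate is folded into the velocity.
    v = 0.0
    for _ in range(steps):
        g = grad(w)
        v = mu * v - lr * g
        w = w + mu * v - lr * g
    return w


def nesterov_one_minus_beta_style(w, lr, beta, steps):
    # Assumed variant: velocity is an exponential moving average of gradients,
    # so the step is scaled by (1 - beta) compared with the torch form.
    v = 0.0
    for _ in range(steps):
        g = grad(w)
        v = beta * v + (1 - beta) * g
        w = w - lr * (beta * v + (1 - beta) * g)
    return w
```

Under these assumptions the torch and tf forms produce the same iterates (the tf velocity is just the torch buffer times -lr), and the (1-beta) form matches them once its learning rate is divided by (1 - beta). If that holds for our code too, the "different formula" would be a reparametrization rather than a different algorithm.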

For ref:
