Explore implementation of `num_updates` in scikit-learn with a one-by-one sample approach
As suggested by @paandrey:
This is indeed a limitation of scikit-learn, which occurs in most cases since most scikit-learn models are designed to handle the entire training procedure in a single `fit` call. It can however be avoided in the specific case of the `SGDClassifier` / `SGDRegressor` classes, because it is possible to set an arbitrary constant learning rate, feed the `partial_fit` method a single data sample at a time, and reset the model's weights after each call. This way, it is possible to compute sample-wise, and therefore batch-wise, gradients. This is what we do in declearn, so that the same optimizer code (and plug-in system) can be used with these models as with torch and tensorflow.
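A minimal sketch of the trick described above (not the actual declearn implementation; the `sample_gradient` helper name is illustrative). With a constant learning rate of 1 and no effective regularization, the weight delta produced by one `partial_fit` call on a single sample equals minus the per-sample gradient, and the weights can then be reset to their previous values:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Constant learning rate of 1.0 => one partial_fit step changes the weights
# by exactly -gradient; alpha=0.0 disables the L2 regularization term.
model = SGDClassifier(learning_rate="constant", eta0=1.0, alpha=0.0)

X = np.array([[0.5, -1.2], [1.0, 0.3], [-0.7, 0.8], [0.2, 0.9]])
y = np.array([0, 1, 0, 1])

# First partial_fit call initializes coef_ / intercept_ (classes required here).
model.partial_fit(X[:1], y[:1], classes=np.array([0, 1]))

def sample_gradient(model, x, y_true):
    """Estimate a per-sample gradient via one partial_fit step, then reset weights."""
    w_before = model.coef_.copy()
    b_before = model.intercept_.copy()
    model.partial_fit(x.reshape(1, -1), np.array([y_true]))
    grad_w = w_before - model.coef_      # eta0=1.0 => delta = -gradient
    grad_b = b_before - model.intercept_
    # Reset the weights so the model state is left unchanged.
    model.coef_ = w_before
    model.intercept_ = b_before
    return grad_w, grad_b

# Sample-wise gradients can then be aggregated into a batch-wise gradient.
grads = [sample_gradient(model, xi, yi) for xi, yi in zip(X, y)]
batch_grad_w = np.mean([g[0] for g in grads], axis=0)
```

Once batch gradients are recovered this way, any external optimizer can apply its own update rule to `coef_` / `intercept_`, which is what lets the same optimizer code serve scikit-learn, torch, and tensorflow models alike.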