Undoing n_epochs to num_batch conversion
MR description
When the Scaffold branch was merged, we introduced a regression by converting n_epochs (the number of epochs) into num_updates (the number of updates) in order to compute the number of updates needed by the Scaffold Aggregator. As a result, each node performed the same number of updates, regardless of the size of its dataset. We thought that this would guarantee a correct Scaffold Aggregator computation. However, this feature breaks the notion of epoch and introduces new bugs, such as incorrect handling of the size of MedicalFolderDataset datasets.
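To illustrate why epochs and updates are not interchangeable, here is a minimal sketch of the arithmetic. The helper name, the node names, and the dataset sizes are hypothetical and not taken from the code base; it also assumes that a final partial batch counts as one update.

```python
import math


def updates_for_epochs(n_epochs: int, dataset_size: int, batch_size: int) -> int:
    """Hypothetical helper: number of updates performed by n_epochs full passes.

    Assumes the last, possibly smaller, batch of each pass still triggers an update.
    """
    batches_per_epoch = math.ceil(dataset_size / batch_size)
    return n_epochs * batches_per_epoch


# Illustrative dataset sizes for two nodes (made-up values).
node_sizes = {"node_a": 1000, "node_b": 250}

# With the same number of epochs, nodes perform different numbers of updates.
updates = {
    name: updates_for_epochs(n_epochs=2, dataset_size=size, batch_size=50)
    for name, size in node_sizes.items()
}
print(updates)
```

Under these assumptions, training for the same number of epochs yields a different number of updates on each node, which is why a fixed number of updates cannot stand in for an epoch without changing what each node actually does.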
In this MR, we undo this conversion and return to our previous notion of epoch (as it was before merging the Scaffold branch), and we raise an error if num_updates is missing when using the Scaffold Aggregator.
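The check described above could look roughly like the following sketch. The function name, the dictionary-based training arguments, the aggregator name strings, and the exception type are assumptions made for illustration; the actual implementation in this MR may differ.

```python
def check_training_args(training_args: dict, aggregator_name: str) -> None:
    """Hypothetical validation: Scaffold requires an explicit num_updates.

    Other aggregators may keep using epochs without specifying num_updates.
    """
    if aggregator_name == "Scaffold" and training_args.get("num_updates") is None:
        raise ValueError(
            "num_updates must be set in the training arguments "
            "when using the Scaffold aggregator"
        )


# Accepted: Scaffold with an explicit number of updates.
check_training_args({"num_updates": 100}, aggregator_name="Scaffold")

# Accepted: another aggregator (name is illustrative) using epochs only.
check_training_args({"epochs": 2}, aggregator_name="FedAverage")

# Rejected: Scaffold with epochs only.
try:
    check_training_args({"epochs": 2}, aggregator_name="Scaffold")
except ValueError as err:
    print(f"Rejected: {err}")
```

Failing early in this way keeps the notion of epoch untouched for other aggregators, while making the Scaffold requirement explicit instead of silently converting epochs into updates.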