add pseudo_epochs feature + documentation

CAULK Robert requested to merge add-pseudo_epochs into master

After some discussion with Marc, we decided it would be nice to have a feature for "pseudo" offline training in Melissa, to help us build exhibits for the documentation (specifically, a nice online/offline heat-PDE example).

We also realized the feature may be useful for users who want to migrate a small-scale experiment to Melissa. The first thing they may want to do is a quick validation of the Melissa machinery, to be confident that everything is connected and working before upscaling. This feature lets them do so.

From the documentation:

The server has a non-default setting for small-scale prototype validation called `pseudo_epochs` (inside the `dl_config`), which changes Melissa's behavior from online training to pseudo-offline training. The goal of this setting is to let users employ Melissa and the basic `RandomQueueBuffer` to aggregate all client samples before initiating training (similar to true offline training). The training loop then samples from the buffer to create `pseudo_epochs` worth of batches. It is important to understand that this does not guarantee each point will be seen `pseudo_epochs` times; instead, it means the total number of batches will be `(num_samples * num_clients / batch_size) * pseudo_epochs`. Points are drawn from the buffer containing the full `num_samples * num_clients` samples, but sampling remains uniformly random (i.e. not all points will be seen an equal number of times). Users can activate this setting inside the `dl_config` with `pseudo_epochs`. By default, `pseudo_epochs` is set to 1 and does not change the online buffer/training behavior.
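The batch count described above can be sketched as follows. This is a minimal illustration of the formula, not Melissa code; the function name and the example values (1000 samples, 4 clients, batch size 32) are purely hypothetical.

```python
# Hypothetical sketch of the pseudo_epochs batch-count formula:
# total batches = (num_samples * num_clients / batch_size) * pseudo_epochs

def total_batches(num_samples: int, num_clients: int,
                  batch_size: int, pseudo_epochs: int = 1) -> int:
    """Number of batches the training loop draws from the full buffer."""
    return (num_samples * num_clients // batch_size) * pseudo_epochs

# Example: 1000 samples per client, 4 clients, batch size 32.
print(total_batches(1000, 4, 32, pseudo_epochs=1))  # → 125
print(total_batches(1000, 4, 32, pseudo_epochs=3))  # → 375
```

Note that because sampling from the buffer is uniformly random, a given point may appear more or fewer than `pseudo_epochs` times across those batches; only the total batch count is fixed.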

Here is an example comparison of the heat-PDE example trained online versus with this pseudo-offline feature:

(Attachment: loss_plot — training-loss curves for the two modes)

Edited by CAULK Robert
