Fed-BioMed's `data_loader`s and `batch_size` attribute issue
This issue relates to a fix (MR !162 (merged)) that addressed the `batch_size` attribute call on `training_data_loader` (in `torchnn`) when using Opacus. The fix may not be optimal and may need further discussion / refactoring.
Issue description
Opacus's `data_loader` attribute `batch_size` returns `None` instead of the actual batch size, which triggered a bug when computing the percentage of epoch completion (sent through the logs).
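For illustration, here is a minimal, hypothetical reconstruction of the failure mode using a plain PyTorch `DataLoader`; passing a `batch_sampler` (as Opacus does with its uniform-with-replacement sampling) makes PyTorch set the loader's `batch_size` attribute to `None`. The percentage computation below is an assumed stand-in for the logging code, not the exact Fed-BioMed source:

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, RandomSampler, TensorDataset

dataset = TensorDataset(torch.zeros(100, 3))
# Mimic the Opacus setup: supplying a batch sampler makes PyTorch
# set the loader's `batch_size` attribute to None.
sampler = BatchSampler(RandomSampler(dataset), batch_size=10, drop_last=False)
loader = DataLoader(dataset, batch_sampler=sampler)

print(loader.batch_size)  # None

batch_idx = 1
progress = 100.0 * batch_idx * loader.batch_size / len(dataset)
# -> TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
```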
Merged fix
In the fix (MR !162 (merged)), we access the batch size using an if/else statement (sketched after the list below):
- if the data used for training is a dictionary, `batch_size` is the length of the first item in the dictionary, i.e. `batch_size = len(list(data.values())[0])`
- else we assume the `data` object is a `list`- or `Tensor`-like object, and wrote `batch_size = len(data)`
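For reference, a minimal sketch of that logic as a standalone helper (the function name is hypothetical; in the MR the branches live inside the training plan):

```python
def infer_batch_size(data):
    """Sketch of the if/else logic from MR !162 (hypothetical helper name)."""
    if isinstance(data, dict):
        # dictionary of tensors: use the length of the first item
        return len(list(data.values())[0])
    # else: assume a list- or Tensor-like object supporting len()
    return len(data)
```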
Issue with the current fix
We are making `data_loader` less generic and undermining the fact that `data_loader` could be used in a generic way, i.e. return `data` in any format. If in the future `data` comes in another format/type, we will have to specify in the `TorchTrainingPlan` how to compute its batch size with respect to that format/type.
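As a hypothetical example of this fragility (not a case reported in the issue): a loader yielding `(inputs, targets)` tuples would silently break the `len(data)` branch:

```python
import torch

inputs = torch.zeros(64, 3, 28, 28)   # a batch of 64 images
targets = torch.zeros(64)
data = (inputs, targets)              # a common loader output format

len(data)  # 2 (the tuple arity), not the actual batch size of 64
```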
Idea for solving the issue
NB: we cannot set or change `Opacus` attributes since they are protected (investigation is needed to understand how they are protected)
- create a wrapper of `training_data_loader`
- create a `BatchSize` class (see the sketch below for how the two ideas could fit together)
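A minimal sketch of how these two pieces could combine; all names are hypothetical (not an agreed Fed-BioMed API), assuming the wrapper delegates iteration to the underlying loader and `BatchSize` centralizes the per-format logic:

```python
class BatchSize:
    """Hypothetical class centralizing how a batch is measured."""

    def of(self, data):
        if isinstance(data, dict):
            # dictionary of tensors: all values share the batch dimension
            return len(next(iter(data.values())))
        # list- or Tensor-like object
        return len(data)


class TrainingDataLoaderWrapper:
    """Hypothetical wrapper exposing a reliable batch size without
    touching Opacus' protected attributes."""

    def __init__(self, data_loader, batch_size=None):
        self._loader = data_loader
        self._batch_size = batch_size or BatchSize()

    def __iter__(self):
        return iter(self._loader)

    def __len__(self):
        return len(self._loader)

    def batch_size_of(self, data):
        # compute the batch size per batch instead of reading the
        # (possibly None) `batch_size` attribute of the wrapped loader
        return self._batch_size.of(data)

    def __getattr__(self, name):
        # delegate any other attribute access to the wrapped loader
        return getattr(self._loader, name)
```

This would keep the format-specific logic in one place: supporting a new `data` format means extending `BatchSize`, rather than editing the training loop in `TorchTrainingPlan`.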