Fed-BioMed's `data_loader`s and `batch_size` attribute issue
This issue relates to a fix (MR !162 (merged)) that addressed the `batch_size` attribute call on `training_data_loader` (in `torchnn`) when using Opacus. The fix may not be optimal and may need further discussion / refactoring.
Issue description
Opacus's `data_loader` attribute `batch_size` returns `None` instead of the actual batch size, which triggered a bug when computing the percentage of epoch completion (sent through the logs).
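For illustration, here is a minimal, hypothetical reconstruction of the failure mode using a plain PyTorch `DataLoader`; passing a `batch_sampler` (as Opacus does with its uniform-with-replacement sampling) makes PyTorch set the loader's `batch_size` attribute to `None`. The percentage computation below is an assumed stand-in for the logging code, not the exact Fed-BioMed source:

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, RandomSampler, TensorDataset

dataset = TensorDataset(torch.zeros(100, 3))
# Mimic the Opacus setup: supplying a batch sampler makes PyTorch
# set the loader's `batch_size` attribute to None.
sampler = BatchSampler(RandomSampler(dataset), batch_size=10, drop_last=False)
loader = DataLoader(dataset, batch_sampler=sampler)

print(loader.batch_size)  # None

batch_idx = 1
progress = 100.0 * batch_idx * loader.batch_size / len(dataset)
# -> TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
```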
Merged fix
In the fix (MR !162 (merged)), we access the batch size using an if/else statement (sketched after the list below):
- if the data used for training is a dictionary, `batch_size` is the length of the first item in the dictionary, i.e. `batch_size = len(list(data.values())[0])`
- else we assume the `data` object is a `list`- or `Tensor`-like object, and wrote `batch_size = len(data)`
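For reference, a minimal sketch of that logic as a standalone helper (the function name is hypothetical; in the MR the branches live inside the training plan):

```python
def infer_batch_size(data):
    """Sketch of the if/else logic from MR !162 (hypothetical helper name)."""
    if isinstance(data, dict):
        # dictionary of tensors: use the length of the first item
        return len(list(data.values())[0])
    # else: assume a list- or Tensor-like object supporting len()
    return len(data)
```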
Issue with the current fix
We are making `data_loader` less generic and undermining the fact that `data_loader` could be used in a generic way, i.e. return `data` in any format. If in the future `data` comes in another format/type, we will have to specify in the `TorchTrainingPlan` how to compute its batch size with respect to that format/type.
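As a hypothetical example of this fragility (not a case reported in the issue): a loader yielding `(inputs, targets)` tuples would silently break the `len(data)` branch:

```python
import torch

inputs = torch.zeros(64, 3, 28, 28)   # a batch of 64 images
targets = torch.zeros(64)
data = (inputs, targets)              # a common loader output format

len(data)  # 2 (the tuple arity), not the actual batch size of 64
```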
Idea for solving the issue
NB: we cannot set or change `Opacus` attributes since they are protected (investigation is needed to understand how they are protected)
- create a wrapper of `training_data_loader`
- create a `BatchSize` class (see the sketch below for how the two ideas could fit together)
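A minimal sketch of how these two pieces could combine; all names are hypothetical (not an agreed Fed-BioMed API), assuming the wrapper delegates iteration to the underlying loader and `BatchSize` centralizes the per-format logic:

```python
class BatchSize:
    """Hypothetical class centralizing how a batch is measured."""

    def of(self, data):
        if isinstance(data, dict):
            # dictionary of tensors: all values share the batch dimension
            return len(next(iter(data.values())))
        # list- or Tensor-like object
        return len(data)


class TrainingDataLoaderWrapper:
    """Hypothetical wrapper exposing a reliable batch size without
    touching Opacus' protected attributes."""

    def __init__(self, data_loader, batch_size=None):
        self._loader = data_loader
        self._batch_size = batch_size or BatchSize()

    def __iter__(self):
        return iter(self._loader)

    def __len__(self):
        return len(self._loader)

    def batch_size_of(self, data):
        # compute the batch size per batch instead of reading the
        # (possibly None) `batch_size` attribute of the wrapped loader
        return self._batch_size.of(data)

    def __getattr__(self, name):
        # delegate any other attribute access to the wrapped loader
        return getattr(self._loader, name)
```

This would keep the format-specific logic in one place: supporting a new `data` format means extending `BatchSize`, rather than editing the training loop in `TorchTrainingPlan`.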