Fixing tensorboard issues (!63) · Merge requests · OBSOLETE_Fed-BioMed / OBSOLETE_Fed-BioMed

This fix is related to issue #165 (closed) and addresses the following issues:

Displaying losses for scikit-learn's linear models (Perceptron, SGDClassifier): losses are computed from the logs that the scikit-learn model prints to stdout during training. For multi-class problems, parallelization is handled by scikit-learn, which trains separate binary models in a "one vs all" fashion and prints logs for each model trained in parallel. This resulted in several loss points instead of one. A weighted average has therefore been implemented to return a single loss point instead of one point per model trained in a "one vs all" fashion.
Correct number of points: whereas with scikit-learn one point is displayed per epoch, with PyTorch the points displayed depend on the value passed for log_interval in the training_args argument. Within an epoch, loss points are therefore displayed for some batches only, according to the following rule:

Let's assume log_interval = 3 and that an epoch is completed over 6 batches. We consider two cases: one epoch and two epochs.

For illustration's sake, we represent below how the points will be displayed (x axis):

+_ _ + _ _+_+ _ + _ +_+ _ + _ _+_+ _ + _ _+
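
The interval rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of this merge request: it assumes the common PyTorch logging pattern in which a point is emitted when the 0-based batch index is a multiple of log_interval, and `logged_steps` is a hypothetical helper name. The exact positions used by the actual implementation (for example with 1-based batch indices) may differ.

```python
def logged_steps(n_epochs, batches_per_epoch, log_interval):
    """Return the global batch positions (x axis) at which a loss point is logged.

    Assumption: a point is logged when the 0-based batch index within the
    epoch is a multiple of log_interval. The real rule may differ.
    """
    steps = []
    for epoch in range(n_epochs):
        for batch_idx in range(batches_per_epoch):
            if batch_idx % log_interval == 0:
                # Convert (epoch, batch) to a single position on the x axis.
                steps.append(epoch * batches_per_epoch + batch_idx)
    return steps


# Values taken from the example above: log_interval = 3, 6 batches per epoch.
one_epoch = logged_steps(n_epochs=1, batches_per_epoch=6, log_interval=3)
two_epochs = logged_steps(n_epochs=2, batches_per_epoch=6, log_interval=3)
print(one_epoch)
print(two_epochs)
```

Under this assumption, doubling the number of epochs doubles the number of points, and their spacing on the x axis stays constant at log_interval batches.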

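For the scikit-learn case, collapsing the per-model losses into one point can be sketched as follows. The description does not say which weights are used, so this minimal example assumes each "one vs all" model is weighted by the number of training samples of its class. The function name and the numbers are hypothetical.

```python
def weighted_average_loss(losses, weights):
    """Collapse several per-model losses into a single loss value.

    Assumption: `weights` holds one non-negative weight per binary model,
    for instance the number of samples of the corresponding class.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(loss * w for loss, w in zip(losses, weights)) / total


# Hypothetical values: three binary models, one per class.
per_model_losses = [0.2, 0.4, 0.6]
class_counts = [50, 30, 20]
epoch_loss = weighted_average_loss(per_model_losses, class_counts)
print(epoch_loss)
```

With equal weights this reduces to a plain mean, so the weighting only matters when classes are imbalanced.
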
Edited Jan 31, 2022 by BOUILLARD Yannick
