Fixing TensorBoard issues
This fix is related to issue #165 (closed) and addresses the following issues:
- Displaying losses for scikit-learn's linear models (`Perceptron`, `SGDClassifier`): losses are computed from the logs output by the scikit-learn model training (through `stdout`). Multi-class parallelization is handled inside scikit-learn (separate binary models are trained in a "one vs all" fashion), and logs are printed for each model trained in parallel, resulting in several loss points instead of one. A weighted average has therefore been implemented to return a single loss point instead of one point per "one vs all" binary model (a sketch of this averaging is given at the end of this list).
- Correct number of points: whereas with scikit-learn points are displayed once per epoch, with PyTorch points are displayed depending on the value passed through the `training_args` entry `log_interval`: within an epoch, points are displayed per batch according to the following rule:
  - one point every `log_interval` batches, over one epoch
  - interval between 2 epochs equals 1
  - interval between 2 rounds equals 1
Let's assume `log_interval = 3` and that an epoch is completed over 6 batches; we run one epoch and then 2 epochs. For illustration's sake, we represent below how points will be displayed along the x axis (a code sketch of this rule follows the illustration):

```
+_ _ + _ _+_+ _ + _ +_+ _ + _ _+_+ _ + _ _+
```
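To make the rule concrete, here is a minimal PyTorch-style sketch, not the actual implementation: the toy model and data are hypothetical, and it assumes the usual `batch_idx % log_interval == 0` reporting condition. With 6 batches per epoch and `log_interval = 3`, it prints one point at batch 0 and one at batch 3 of each epoch.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

log_interval = 3  # mirrors the `log_interval` entry of `training_args`

# Toy model and data: 18 samples with batch_size=3 give 6 batches per epoch.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = DataLoader(TensorDataset(torch.randn(18, 4), torch.randn(18, 1)),
                    batch_size=3)

for epoch in range(2):
    for batch_idx, (data, target) in enumerate(loader):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(data), target)
        loss.backward()
        optimizer.step()
        if batch_idx % log_interval == 0:
            # one displayed point every `log_interval` batches within an epoch
            print(f"epoch {epoch} batch {batch_idx} loss {loss.item():.4f}")
```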
- It contains additional comments, making parameter values easier to understand.
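As referenced in the first point above, here is a minimal sketch of the weighted averaging of "one vs all" losses. It is illustrative, not the actual fix: the function name is hypothetical, the per-class losses stand in for values parsed from the scikit-learn training logs, and weighting by class support is an assumption.

```python
import numpy as np

def weighted_average_loss(per_class_losses: dict, class_counts: dict) -> float:
    """Collapse the per-class losses printed during "one vs all" training
    into a single point, weighting each binary model by the number of
    samples of its class (an assumed weighting scheme)."""
    classes = sorted(per_class_losses)
    losses = np.array([per_class_losses[c] for c in classes])
    weights = np.array([class_counts[c] for c in classes], dtype=float)
    return float(np.average(losses, weights=weights))

# Three binary models (one per class), each with its own parsed loss value:
print(weighted_average_loss({0: 0.42, 1: 0.58, 2: 0.50},
                            {0: 120, 1: 30, 2: 50}))  # -> single loss point
```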