Fixing tensorboard issues

This fix is related to issue #165 (closed) and addresses the following issues:

  1. Displaying losses for scikit-learn's linear models (Perceptron, SGDClassifier): losses are computed from the logs that scikit-learn prints to stdout during model training. For multi-class problems, scikit-learn handles the parallelization itself by training separate binary models in a "one vs all" fashion, and logs are printed for each model trained in parallel, resulting in several loss points instead of one. Therefore, a weighted average has been implemented to return a single loss point instead of one loss point per "one vs all" model.
  2. Correct number of points: whereas with scikit-learn one point is displayed for each epoch, in PyTorch points are displayed depending on the value passed in the training_args argument log_interval. Within an epoch, points are therefore displayed for the batch losses according to the following rules:
  • one point every log_interval batches within an epoch
  • the interval between 2 epochs equals 1
  • the interval between 2 rounds equals 1

Let's assume log_interval = 3 and that an epoch is completed over 6 batches. We are doing one epoch and 2 epochs:

For illustration's sake, the diagram below shows how the points are laid out along the x axis:

+_ _ + _ _+_+ _ + _ +_+ _ + _ _+_+ _ + _ _+
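The spacing rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of this merge request: the function name `x_positions` and its parameters are hypothetical, and it assumes that a point is logged at batch indices 0, log_interval, 2 * log_interval, and so on within each epoch, so its output may not match the diagram point for point.

```python
def x_positions(log_interval, batches_per_epoch, epochs, rounds):
    """Return the x coordinates of the logged loss points.

    Hypothetical helper: assumes one point at batch indices
    0, log_interval, 2 * log_interval, ... within each epoch.
    """
    positions = []
    x = 0
    for _round in range(rounds):
        for _epoch in range(epochs):
            logged_batches = range(0, batches_per_epoch, log_interval)
            for i, _batch in enumerate(logged_batches):
                if i > 0:
                    # consecutive points in an epoch are log_interval apart
                    x += log_interval
                positions.append(x)
            # a gap of 1 separates two epochs, and likewise two rounds
            x += 1
    return positions


if __name__ == "__main__":
    # log_interval = 3, 6 batches per epoch, 2 epochs, 1 round
    print(x_positions(3, 6, 2, 1))
```

With these assumptions, the gap between the last point of one epoch and the first point of the next is always 1, regardless of log_interval.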

  3. It contains additional comments so that parameter values are easier to understand
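As a rough sketch of the weighted average described in the first item: the snippet below extracts the "Avg. loss" values from captured stdout text and reduces them to a single point. The names `parse_losses` and `weighted_loss`, the sample log text, and the choice of weights are all assumptions made for illustration. The description above does not say which weights are used, and the exact log format may differ between scikit-learn versions.

```python
import re

# Matches the value printed after "Avg. loss:" in a verbose training log line.
AVG_LOSS = re.compile(r"Avg\. loss: (\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def parse_losses(captured_stdout):
    """Return every 'Avg. loss' value found in the captured log text."""
    return [float(value) for value in AVG_LOSS.findall(captured_stdout)]


def weighted_loss(losses, weights):
    """Collapse one loss per binary model into a single weighted average."""
    if len(losses) != len(weights):
        raise ValueError("expected one weight per loss value")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(l * w for l, w in zip(losses, weights)) / total


# Illustrative log text for three "one vs all" binary models (made-up values).
SAMPLE_LOG = """\
-- Epoch 1
Norm: 0.10, NNZs: 4, Bias: 0.000000, T: 100, Avg. loss: 0.500000
-- Epoch 1
Norm: 0.20, NNZs: 4, Bias: 0.000000, T: 100, Avg. loss: 1.000000
-- Epoch 1
Norm: 0.30, NNZs: 4, Bias: 0.000000, T: 100, Avg. loss: 1.500000
"""

if __name__ == "__main__":
    losses = parse_losses(SAMPLE_LOG)
    # Assumed weights, e.g. the number of samples per class.
    print(weighted_loss(losses, [1, 1, 2]))
```

Passing the weights explicitly keeps the sketch neutral about how they are chosen. Equal weights reduce it to a plain mean.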
Edited by BOUILLARD Yannick
