KeyError: 'train_mean'

When running a classification task, we got a KeyError: "train_mean".

This is easy to reproduce with the quickstart demo: create a fake column (big in the following example) with True/False values, and set the learned params to:

 learned_params:
     big:
         type: classification
         classes: 2
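
For reference, here is a minimal sketch of how such a fake True/False column could be added to a CSV parameter table. The function name add_boolean_column and the columns scenario_idx and mu are hypothetical stand-ins, not the actual quickstart table, and the values are assigned at random only to get two classes:

```python
import csv
import io
import random


def add_boolean_column(csv_text, column="big", seed=0):
    """Return csv_text with an extra True/False column appended."""
    rng = random.Random(seed)
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Arbitrary label: the point is only to have a two-class column.
        row[column] = rng.choice([True, False])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical stand-in for the quickstart parameter table.
demo = "scenario_idx,mu\n0,1e-8\n1,2e-8\n"
print(add_boolean_column(demo))
```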

dnadna train BIG/BIG_training_config.yml --debug

2022-01-21 11:25:51;    DEBUG;  loaded built-in plugin pluggable:network (<class 'dnadna.nets.Network'>)
2022-01-21 11:25:52;    DEBUG;  loaded built-in plugin network:spidna (<class 'dnadna.nets.SPIDNA'>)
[...]
2022-01-21 11:25:52;    DEBUG;  loaded plugin dnadna.transforms (modified 2021-11-05 12:46:37.041309) providing:
2022-01-21 11:25:52;     INFO;  Process ID: 27904
2022-01-21 11:25:52;     INFO;  Preparing training run
2022-01-21 11:25:54;     INFO;  Initializing dataset...
2022-01-21 11:25:54;     INFO;  60 samples in the validation set and 140 in the training set
2022-01-21 11:25:54;     INFO;  inferred parameters for CustomCNN: n_snp=500, n_indiv=50, concat=True
2022-01-21 11:25:54;     INFO;  Start training
2022-01-21 11:25:54;     INFO;  Networks states are saved after each validation step
2022-01-21 11:25:54;  WARNING;  Current behavior if SNP matrices have different shapes: padding with -1 (right and bottom) to fit the maximum dimension within each batch.
2022-01-21 11:25:54;     INFO;  Starting Epoch #1
2022-01-21 11:25:54;    DEBUG;  step 0
2022-01-21 11:25:54;    DEBUG;  got dataset
2022-01-21 11:25:54;    DEBUG;  predict on training
2022-01-21 11:25:57;    DEBUG;  <class 'torch.Tensor'>, cuda
2022-01-21 11:25:57;     INFO;  Validation at epoch: 1 and batch: 1
2022-01-21 11:25:57;     INFO;  Compute all outputs for validation dataset...
2022-01-21 11:25:58;     INFO;  Done
2022-01-21 11:25:58;     INFO;  training loss = 0.312195360660553 // validation loss = 0.37020185589790344
epoch 1/5:   0%|                                                                                                                                                                         | 0/35 [00:04<?, ?batch/s]
2022-01-21 11:25:58;    ERROR;  KeyError: 'train_mean'
Traceback (most recent call last):
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/cli.py", line 254, in main
    ret = cls.run(args)
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/cli/train.py", line 55, in run
    model_trainer.run_training(run_id=run_id, overwrite=args.overwrite)
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 563, in run_training
    best_loss = self.train()
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 520, in train
    return best_loss
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/misc.py", line 411, in __exit__
    raise exc_value
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 510, in train
    best_loss = self._train_outer_loop(bar)
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 662, in _train_outer_loop
    targets)
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 730, in _train_inner_loop
    quiet=True, batch=batch, step=step)
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 623, in save_net
    'train_mean': self.config['train_mean'],
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/config.py", line 1009, in __getitem__
    value = super().__getitem__(key)
  File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/config.py", line 436, in __getitem__
    raise KeyError(key)
KeyError: 'train_mean'

This comes from the save_net function, which expects the train_mean/std parameters to be present in the config, whereas those are optional.
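
One possible direction for a fix, as a rough sketch rather than the actual save_net code: copy the optional keys into the saved dict only when they exist. The helper build_checkpoint, the "state_dict" entry and the key name train_std are assumptions made for illustration; only 'train_mean' appears in the traceback above, and a plain dict stands in for the real config object.

```python
# Assumed names of the optional keys; only 'train_mean' is confirmed
# by the traceback, 'train_std' is a guess.
OPTIONAL_KEYS = ("train_mean", "train_std")


def build_checkpoint(config, state_dict):
    """Assemble the dict to save, skipping optional keys that are absent."""
    checkpoint = {"state_dict": state_dict}
    for key in OPTIONAL_KEYS:
        # Membership test instead of config[key], so a missing key
        # no longer raises KeyError.
        if key in config:
            checkpoint[key] = config[key]
    return checkpoint


# Classification-only config: no normalization statistics available.
print(build_checkpoint({}, {"weights": [0.1, 0.2]}))
# Config that does carry a training mean.
print(build_checkpoint({"train_mean": {"mu": 1.5}}, {"weights": [0.1, 0.2]}))
```

Whatever loads the saved network would then need to tolerate the missing entries as well.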

Also, I'm surprised such an error still occurs. Don't we have a test for classification tasks? We should!

Edited Jan 21, 2022 by Jean Cury
