KeyError: 'train_mean'
Running a classification task raises KeyError: 'train_mean'.
This is easy to reproduce with the quickstart demo: create a fake column (big
in the following example) with True/False values, and set the learned params:
learned_params:
  big:
    type: classification
    classes: 2
dnadna train BIG/BIG_training_config.yml --debug
2022-01-21 11:25:51; DEBUG; loaded built-in plugin pluggable:network (<class 'dnadna.nets.Network'>)
2022-01-21 11:25:52; DEBUG; loaded built-in plugin network:spidna (<class 'dnadna.nets.SPIDNA'>)
[...]
2022-01-21 11:25:52; DEBUG; loaded plugin dnadna.transforms (modified 2021-11-05 12:46:37.041309) providing:
2022-01-21 11:25:52; INFO; Process ID: 27904
2022-01-21 11:25:52; INFO; Preparing training run
2022-01-21 11:25:54; INFO; Initializing dataset...
2022-01-21 11:25:54; INFO; 60 samples in the validation set and 140 in the training set
2022-01-21 11:25:54; INFO; inferred parameters for CustomCNN: n_snp=500, n_indiv=50, concat=True
2022-01-21 11:25:54; INFO; Start training
2022-01-21 11:25:54; INFO; Networks states are saved after each validation step
2022-01-21 11:25:54; WARNING; Current behavior if SNP matrices have different shapes: padding with -1 (right and bottom) to fit the maximum dimension within each batch.
2022-01-21 11:25:54; INFO; Starting Epoch #1
2022-01-21 11:25:54; DEBUG; step 0
2022-01-21 11:25:54; DEBUG; got dataset
2022-01-21 11:25:54; DEBUG; predict on training
2022-01-21 11:25:57; DEBUG; <class 'torch.Tensor'>, cuda
2022-01-21 11:25:57; INFO; Validation at epoch: 1 and batch: 1
2022-01-21 11:25:57; INFO; Compute all outputs for validation dataset...
2022-01-21 11:25:58; INFO; Done
2022-01-21 11:25:58; INFO; training loss = 0.312195360660553 // validation loss = 0.37020185589790344
epoch 1/5: 0%| | 0/35 [00:04<?, ?batch/s]
2022-01-21 11:25:58; ERROR; KeyError: 'train_mean' | 0/35 [00:03<?, ?batch/s]
Traceback (most recent call last):
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/cli.py", line 254, in main
ret = cls.run(args)
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/cli/train.py", line 55, in run
model_trainer.run_training(run_id=run_id, overwrite=args.overwrite)
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 563, in run_training
best_loss = self.train()
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 520, in train
return best_loss
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/misc.py", line 411, in __exit__
raise exc_value
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 510, in train
best_loss = self._train_outer_loop(bar)
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 662, in _train_outer_loop
targets)
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 730, in _train_inner_loop
quiet=True, batch=batch, step=step)
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/training.py", line 623, in save_net
'train_mean': self.config['train_mean'],
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/config.py", line 1009, in __getitem__
value = super().__getitem__(key)
File "/home/tau/jcury/DNADNA_project/dnadna/dnadna/utils/config.py", line 436, in __getitem__
raise KeyError(key)
KeyError: 'train_mean'
This comes from the save_net function, which expects the train_mean/std parameters even though they are optional.
Also, I'm surprised this error still occurs. Don't we have a test for classification tasks? We should!
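One possible direction for a fix, as a minimal sketch rather than the actual dnadna code: only copy the normalization statistics into the checkpoint when they are present in the config. The helper name build_checkpoint_metadata is hypothetical, plain dicts stand in for the dnadna config object, and the key name 'train_std' is an assumption based on the "train_mean/std" wording above (only 'train_mean' appears in the traceback).

```python
def build_checkpoint_metadata(config):
    """Collect normalization stats for a checkpoint, tolerating their absence.

    Hypothetical helper: stands in for the part of save_net that currently
    does self.config['train_mean'] unconditionally.
    """
    metadata = {}
    # 'train_std' is an assumed key name; only 'train_mean' is confirmed
    # by the traceback.
    for key in ("train_mean", "train_std"):
        # A classification-only config may not define these keys,
        # so skip them instead of raising KeyError.
        if key in config:
            metadata[key] = config[key]
    return metadata


# Plain dicts used as stand-ins for the real config object.
classification_config = {
    "learned_params": {"big": {"type": "classification", "classes": 2}},
}
regression_config = {
    "train_mean": {"x": 0.5},
    "train_std": {"x": 0.1},
}

print(build_checkpoint_metadata(classification_config))
print(build_checkpoint_metadata(regression_config))
```

The two sample configs also hint at what a regression test could check: saving a network for a classification-only config should not raise, and an existing regression config should still carry its statistics through.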