Exp: What is the impact of these modifications on the size of the dataset?
Since I converted all of the grayscale images (1 channel) from IAM-DB to RGBA (4 channels), I wondered how much heavier this made the dataset. If gaining 3% in accuracy means paying a higher computation cost because of the larger data, that trade-off is worth a closer look.
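As a rough upper bound on what to expect, here is a minimal sketch of the uncompressed size of an 8-bit image in each mode. The image dimensions are hypothetical and chosen only for illustration, and the mode names `L` and `RGBA` are used purely as labels. The size on disk can be smaller than this if the image files are stored in a compressed format.

```python
# Bytes per pixel for 8-bit images in the two modes discussed above.
# "L" (grayscale) and "RGBA" are used here only as labels.
CHANNELS = {"L": 1, "RGBA": 4}

def raw_size_bytes(width, height, mode):
    """Uncompressed size of an 8-bit image: one byte per channel per pixel."""
    return width * height * CHANNELS[mode]

# Hypothetical image dimensions, for illustration only.
w, h = 200, 60
gray = raw_size_bytes(w, h, "L")
rgba = raw_size_bytes(w, h, "RGBA")
print(gray, rgba, rgba / gray)  # 12000 48000 4.0
```

Once decoded into memory, an RGBA image therefore takes four times the space of its grayscale version, whatever the file size on disk.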
For the IAM dataset, I get the following (counting only trainset, testset, validationset1 and validationset2):
- original dataset: 626 MB
107M ../get_iam/iam_pairs/testset/
325M ../get_iam/iam_pairs/trainset/
52M ../get_iam/iam_pairs/validationset1/
62M ../get_iam/iam_pairs/validationset2/
- dataset converted to RGBA: 929 MB
181M ../get_iam/iam_pairs_rgba/testset/
550M ../get_iam/iam_pairs_rgba/trainset/
87M ../get_iam/iam_pairs_rgba/validationset1/
103M ../get_iam/iam_pairs_rgba/validationset2/
- RGBA dataset "blended": 3.1 GB (3187 MB)
611M ../green/iam_pairs_blended/testset/
1.9G ../green/iam_pairs_blended/trainset/
287M ../green/iam_pairs_blended/validationset1/
309M ../green/iam_pairs_blended/validationset2
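The per-directory figures above look like `du -h` output. As a sketch, a similar measurement can be made from Python, and the growth factors can be derived from the three reported totals. The helper name `dir_size_mb` is hypothetical. It sums apparent file sizes, whereas `du` reports disk usage by default, so the results may differ slightly from the listing.

```python
import os

def dir_size_mb(root):
    """Sum the apparent size of all regular files under root, in MiB."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total / (1024 * 1024)

# Totals as reported in the list above, in MB.
reported = {"original": 626, "rgba": 929, "blended": 3187}

# Growth factor of each variant relative to the original dataset.
growth = {name: round(size / reported["original"], 2)
          for name, size in reported.items()}
print(growth)  # {'original': 1.0, 'rgba': 1.48, 'blended': 5.09}
```

With the reported totals, the RGBA conversion costs roughly 1.5 times the original disk space, and the blended version roughly 5 times.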