MedNIST dataset issue with MONAI classification notebook
This issue is reported against the MONAI classification notebook, but it is probably a generic issue in the MedNIST dataset implementation and unrelated to MONAI (the notebook is just the case where it was observed and documented).
When running the notebooks/monai-2d-image-classification.ipynb notebook:
- we load the dataset as indicated, i.e. we download the dataset from drive.google.com and then select one of the subdirectories (e.g. client_1).
- error: client_1 is not recognized as a MedNIST directory, so the whole dataset is downloaded and shared in Fed-BioMed. This means each node uses the whole MedNIST dataset (58954 samples) instead of its own subset:
2023-05-24 08:57:36,763 fedbiomed DEBUG - /data/mvesin/data/MedNIST_clients/client_1
2023-05-24 08:57:36,763 fedbiomed INFO - PATH VALUE /data/mvesin/data/MedNIST_clients/client_1
2023-05-24 08:57:36,764 fedbiomed INFO - Now downloading MEDNIST...
2023-05-24 08:57:47,006 fedbiomed INFO - Now extracting MEDNIST...
...
MEDNIST mednist ['#MEDNIST', '#dataset'] MEDNIST dataset [58954, 3, 64, 64] /data/mvesin/data/MedNIST_clients/client_1/MedNIST dataset_03c1cd00-3fde-4498-ad9a-119c38ea1289
The MedNIST data loader looks for a subdirectory named MedNIST inside the selected directory and expects it to have the proper structure. For example, if we rename client_1 to MedNIST and select the parent directory, the subset of MedNIST is loaded as expected (note the number of samples: 18000 instead of 58954, so only a subset of the whole MedNIST dataset is used now):
$ mv client_1 MedNIST
$ ./scripts/fedbiomed_run node add
...
2023-05-24 09:12:41,998 fedbiomed DEBUG - /data/mvesin/data/MedNIST_clients
2023-05-24 09:12:41,998 fedbiomed INFO - PATH VALUE /data/mvesin/data/MedNIST_clients
...
MEDNIST mednist ['#MEDNIST', '#dataset'] MEDNIST dataset [18000, 3, 64, 64] /data/mvesin/data/MedNIST_clients/MedNIST dataset_dc90e1f7-a841-4780-b138-922a9b28f665
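The lookup behaviour inferred from the logs above can be sketched as follows. This is only an illustration, not Fed-BioMed's actual code: the helper name needs_download and its dataset_dirname parameter are hypothetical, and the check is reduced to "does a MedNIST subdirectory exist", without validating its internal structure.

```python
from pathlib import Path
import tempfile


def needs_download(selected_dir: Path, dataset_dirname: str = "MedNIST") -> bool:
    """Hypothetical helper: True when `selected_dir` has no `MedNIST` subdirectory.

    Assumption drawn from the logs: in that case the full dataset is
    downloaded into `selected_dir / "MedNIST"`.
    """
    return not (selected_dir / dataset_dirname).is_dir()


# Reproduce both situations from the report in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "MedNIST_clients"
    (root / "client_1").mkdir(parents=True)

    # Case 1: client_1 itself is selected; it contains no MedNIST subdirectory.
    selecting_client_dir = needs_download(root / "client_1")

    # Case 2: client_1 is renamed to MedNIST and the parent directory is selected.
    (root / "client_1").rename(root / "MedNIST")
    selecting_parent_dir = needs_download(root)

print(selecting_client_dir, selecting_parent_dir)
```

Under these assumptions, selecting client_1 directly triggers the full download, while selecting the parent directory after the rename does not.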