Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (34)
Showing
with 7977 additions and 3068 deletions
......@@ -4,10 +4,12 @@ before_script:
- pip install -r requirements-site.txt
pages:
tags:
- ci.inria.fr
- small
stage: deploy
script:
- jupyter-book build .
- mv _build/jupyter_execute .
- mv _build/html public
artifacts:
paths:
......
title: "AI4health winter school - Practical Session on Handling heterogeneity in the analysis of biomedical information"
author: "Practical Session on Handling heterogeneity in the analysis of biomedical information Team"
logo: "assets/img/logo.jpg"
title: "2023 AI4Health summer school - Practical Session on Fed-BioMed, an open source framework for federated learning in real world healthcare applications"
author: "Lucia Innocenti and Francesco Cremonesi"
logo: "assets/img/logo.png"
only_build_toc_files: true
exclude_patterns: [_build, Thumbs.db, .DS_Store, "**.ipynb_checkpoints", "README.md"]
......@@ -8,7 +8,7 @@ execute:
execute_notebooks : "off"
repository:
url: https://gitlab.inria.fr/epione/flhd
url: https://github.com/fedbiomed/fedbiomed
path_to_book : "/" # A path to your book's folder, relative to the repository root.
branch : master # Which branch of the repository should be used when creating links
......
- file: index.md
title: Welcome
- part: Multivariate models for the analysis of heterogeneous information
chapters:
- file: heterogeneous_data/introduction.md
# - file: heterogeneous_data/generate_pseudo_adni_dataset.ipynb
- file: heterogeneous_data/heterogeneous_data.ipynb
title: "Multivariate association models for the analysis of heterogeneous data"
- part: Federated Learning
chapters:
- file: federated_learning/introduction.md
- file: federated_learning/FedAvg_FedProx_MNIST_iid_and_noniid.ipynb
title: FedAVG and FedProx
- file: federated_learning/federated_mcvae.ipynb
title: Federated VAEs
- file: federated_learning/mcvae_rotated_mnist.ipynb
title: Federated Multi-channel VAEs on MNIST
- file: federated_learning/federated_mcvae_adni.ipynb
title: Federated Multi-channel VAEs on biomedical data
- part: About
chapters:
- file: contributors.md
# - part: federated Learning
# chapters:
# - file: federated_learning/introduction.md
format: jb-book
root: index.md
title: Welcome
parts:
- caption: Workshop instructions
chapters:
- file: fedbiomed-tutorial/slides.md
- file: fedbiomed-tutorial/aws-instructions.md
- file: fedbiomed-tutorial/tensorboard-instructions.md
- caption: Tutorial exercises
chapters:
- file: fedbiomed-tutorial/intro-tutorial-mednist.ipynb
title: Intro tutorial (MedNIST)
- file: fedbiomed-tutorial/tutorial-sklearn-problem.ipynb
title: Heart disease detection
- file: fedbiomed-tutorial/tutorial-sklearn-solutions.ipynb
title: Heart disease detection
- file: fedbiomed-tutorial/brain-segmentation-exercise.ipynb
title: Brain segmentation
- file: fedbiomed-tutorial/brain-segmentation-solution.ipynb
title: Brain segmentation - Solution
- caption: FAQ
chapters:
- file: faq/faq.md
assets/img/logo.jpg

17.7 KiB

assets/img/logo.png

18.4 KiB

#
# environment for fedbiomed-researcher
#
#
#
name: fedbiomed-researcher
channels:
- conda-forge
dependencies:
# minimal environment
- python >=3.9,<3.10
- pip
- jupyter
- ipython
# tests
- tinydb >=4.4.0,<5.0.0
- tabulate >=0.8.9,<0.9.0
# tools
- colorama
- pyyaml
# code
- GitPython >=3.1.14,<4.0.0
- requests >=2.25.1,<3.0.0
- paho-mqtt >=1.5.1,<2.0.0
- validators >=0.18.2,<0.19.0
- tqdm >=4.59.0,<5.0.0
- git
- packaging >=23.0,<24.0
# these two have to be aligned
- cryptography ~=39.0
- pyopenssl ~=23.0
# git notebook striper
- nbstripout
- joblib >=1.0.1
# sklearn
# scipy >= 1.9 from conda-forge needs recent GLIBC thus causes issue 389
# with many current systems
# another option is to install scipy from pip
- scipy >=1.8.0,<1.9.0
- scikit-learn >=1.0.0,<1.1.0
# other
- itk
- pip:
    # nn
    - torch >=1.8.0,<2.0.0
    - torchvision >=0.9.0,<0.15.0
    - opacus >=1.2.0,<1.3.0
    - monai >=1.1.0,<1.2.0
    # other
    - msgpack ~=1.0
    - persist-queue >=0.5.1,<0.6.0
    - pandas >=1.2.3,<2.0.0
    - openpyxl >= 3.0.9,<3.1
    - tensorboard
    - JSON-log-formatter
    - python-minifier ==2.5.0
    - pathvalidate
    # declearn
    - declearn[torch] ~= 2.1.0
    - gmpy2 >=2.1,< 2.2
    #### Notebook-specific packages ####
    # This section contains packages that are needed only to run specific notebooks
    - unet == 0.7.7
# Frequently Asked Questions
## Will JupyterHub be accessible after the workshop?
Unfortunately, no. We will shut down the JupyterHub server at the end of each workshop day.
## What happens if I close a notebook or a terminal tab?
No worries, the underlying process will continue to run. To reopen it, navigate to the JupyterHub homepage (the IP address provided [here](/fedbiomed-tutorial/aws-instructions)) and click on the "Running" tab in the top left.
## How to solve "fedbiomed Module Not Found" error
This may be due to the wrong jupyter kernel being selected. In your notebook, under the `Kernel` tab, go to `Change kernel` and select `fedbiomed-researcher`.
## After abruptly closing the node, it automatically starts executing a training when I restart it
This is a common issue: the task was not correctly deleted from the node's queue. Currently, the best option is to delete the folder that contains all of the node's queue information. First, find your node id (it is listed in the `fedbiomed/etc/site_*.ini` files), then remove the folder with `rm -r fedbiomed/var/queue_manager_<node id>`.
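The same cleanup could be scripted. The following is a minimal sketch, assuming the queue folder is named `queue_manager_<node id>` under `fedbiomed/var` as described above; the helper name `remove_node_queue` and its arguments are hypothetical and not part of Fed-BioMed. Stop the node before running it.

```python
import shutil
from pathlib import Path


def remove_node_queue(fedbiomed_root, node_id):
    """Delete the queue folder of a stopped node.

    Hypothetical helper: the folder layout follows the FAQ answer,
    i.e. <fedbiomed_root>/var/queue_manager_<node_id>.
    Returns True if a folder was removed, False if none was found.
    """
    queue_dir = Path(fedbiomed_root) / "var" / f"queue_manager_{node_id}"
    if not queue_dir.is_dir():
        return False
    shutil.rmtree(queue_dir)
    return True
```

For example, `remove_node_queue(Path.home() / "fedbiomed", "<node id>")` would target the folder used in this workshop, with `<node id>` replaced by the id found in your `.ini` file.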
# Instructions for the tutorial
Fed-BioMed is an open-source research and development initiative for translating federated learning into real-world medical applications.
The community of Fed-BioMed gathers experts in medical engineering, machine learning, communication, and security.
We all contribute to provide an open, user-friendly, and trusted framework for deploying the state-of-the-art of federated learning in sensitive environments, such as in hospitals and health data lakes.
Check out our [fedbiomed.org](https://fedbiomed.org) for the latest documentation and news!
## Connecting to JupyterHub
We provide a ready-to-use JupyterHub instance running on AWS for you.
To use it, find your username in the table below and open the associated URL in a browser.
### Friday, July 7th
| user | address |
| --- | --- |
| francesco | http://34.251.43.96 |
| lucia | http://34.251.43.96 |
| yannick\_shaofeng | http://54.194.207.88 |
| emma | http://54.194.207.88 |
| celine | http://54.194.207.88 |
| isabella | http://54.194.207.88 |
| brice | http://54.194.207.88 |
| melanie | http://54.194.207.88 |
| theo | http://54.194.207.88 |
| javier | http://34.251.43.96 |
| hugo | http://34.251.43.96 |
| aleksandar | http://34.251.43.96 |
| narges | http://34.251.43.96 |
| charlotte | http://34.251.43.96 |
| hanae | http://34.251.43.96 |
| yevgenyi | http://34.244.73.13 |
| lea | http://34.244.73.13 |
| theodore | http://34.244.73.13 |
| takoua | http://34.244.73.13 |
| paula | http://34.244.73.13 |
| quoc\_viet | http://34.244.73.13 |
| franka | http://34.244.73.13 |
You may log in with the password provided to you by the workshop presenters.
If you wish, you may also [install](https://fedbiomed.org/latest/tutorials/installation/0-basic-software-installation/) Fed-BioMed locally on your machine. If you have any questions about local installation we can try to quickly help during the workshop, or you may ask support questions through our [mailing list](mailto:fedbiomed-support@inria.fr) and our [user discord channel](https://discord.gg/SWUb7QAS).
## Launching the Fed-BioMed components
In today's tutorial, you will be launching one researcher and two node components.
### The Fed-BioMed node
We have prepared two nodes for you, called `site_1.ini` and `site_2.ini`.
To run them, open two terminals in JupyterHub and navigate to the fedbiomed folder in both of them:
```bash
cd $HOME/fedbiomed
```
Then, in the first terminal execute
```bash
./scripts/fedbiomed_run node config site_1.ini start
```
while in the other terminal you should execute
```bash
./scripts/fedbiomed_run node config site_2.ini start
```
Note that after this command, the terminal is fully dedicated to running the node, and you may not execute other commands on that terminal unless you stop the node's execution beforehand.
To stop the node, use the `Ctrl+C` combination.
%% Cell type:markdown id:c1516f38 tags:
# Intro tutorial (MedNIST)
%% Cell type:code id:5cc104e6 tags:
``` python
%load_ext tensorboard
```
%% Cell type:markdown id:c1516f37 tags:
## Nodes inspection
%% Cell type:markdown id:024a8a5e tags:
First thing, let's check which nodes are available for training and their characteristics:
%% Cell type:code id:9f6f30f6 tags:
``` python
from fedbiomed.researcher.requests import Requests
req = Requests()
req.list(verbose=True)
```
%% Cell type:code id:583f88a6 tags:
``` python
import os
import numpy as np
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms
from monai.apps import download_and_extract
from monai.config import print_config
from monai.data import decollate_batch
from monai.metrics import ROCAUCMetric
from monai.networks.nets import DenseNet121
from monai.transforms import (
Activations,
AddChannel,
AsDiscrete,
Compose,
LoadImage,
RandFlip,
RandRotate,
RandZoom,
ScaleIntensity,
EnsureType,
)
from monai.utils import set_determinism
from torch.optim import AdamW, Adam, SGD
import matplotlib.pyplot as plt
import PIL
import pandas as pd
from tqdm import tqdm
```
%% Cell type:markdown id:2a51bf27 tags:
## Training plan definition
%% Cell type:markdown id:f45f387b tags:
A Training Plan contains the recipe for executing the training loop on the nodes. It defines: the data, the model, the loss function, and the optimizer. The code in the training plan is shipped in its entirety to the nodes, where its different parts are executed at different times during the training loop.
Our example contains:
1) a model instance
2) an optimizer instance
3) a list of dependencies (i.e. modules to be imported before instantiating the model and optimizer)
4) how to load the training data (and potential preprocessing)
5) a loss function
%% Cell type:code id:62d28d22 tags:
``` python
class TrainingPlan(TorchTrainingPlan):

    class MedNISTDataset(torch.utils.data.Dataset):
        def __init__(self, image_files, labels, transforms):
            self.image_files = image_files
            self.labels = labels
            self.transforms = transforms

        def __len__(self):
            return len(self.image_files)

        def __getitem__(self, index):
            return self.transforms(self.image_files[index]), self.labels[index]

    def init_model(self, model_args):
        model = DenseNet121(spatial_dims=2, in_channels=1,
                            out_channels=model_args["num_class"])
        return model

    def init_dependencies(self):
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        deps = ["import numpy as np",
                "import os",
                "from monai.apps import download_and_extract",
                "from monai.config import print_config",
                "from monai.data import decollate_batch",
                "from monai.metrics import ROCAUCMetric",
                "from monai.networks.nets import DenseNet121",
                "from torch.optim import AdamW, Adam, SGD",
                "from monai.transforms import ( Activations, AddChannel, AsDiscrete, Compose, LoadImage, RandFlip, RandRotate, RandZoom, ScaleIntensity, EnsureType, )",
                "from monai.utils import set_determinism"]
        return deps

    def parse_data(self, path):
        # One sub-folder per class; each sub-folder holds the images of that class
        class_names = sorted(x for x in os.listdir(path) if os.path.isdir(os.path.join(path, x)))
        num_class = len(class_names)
        image_files = [
            [
                os.path.join(path, class_names[i], x)
                for x in os.listdir(os.path.join(path, class_names[i]))
            ]
            for i in range(num_class)
        ]
        return image_files, num_class

    def training_data(self, batch_size=32):
        self.image_files, num_class = self.parse_data(self.dataset_path)

        if self.model_args()["num_class"] != num_class:
            raise Exception('number of available classes does not match declared classes')

        # Flatten the per-class file lists and build the matching label list
        num_each = [len(self.image_files[i]) for i in range(self.model_args()["num_class"])]
        image_files_list = []
        image_class = []
        for i in range(self.model_args()["num_class"]):
            image_files_list.extend(self.image_files[i])
            image_class.extend([i] * num_each[i])

        train_transforms = Compose(
            [
                LoadImage(image_only=True),
                AddChannel(),
                ScaleIntensity(),
                RandRotate(range_x=np.pi / 12, prob=0.5, keep_size=True),
                RandFlip(spatial_axis=0, prob=0.5),
                RandZoom(min_zoom=0.9, max_zoom=1.1, prob=0.5),
                EnsureType(),
            ]
        )
        self.train_ds = self.MedNISTDataset(image_files_list, image_class, train_transforms)
        return DataManager(dataset=self.train_ds, batch_size=batch_size, shuffle=True)

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss = torch.nn.functional.cross_entropy(output, target)
        return loss
```
%% Cell type:markdown id:82da9007 tags:
## Experiment definition
%% Cell type:code id:cac8dd44 tags:
``` python
model_args = {
'num_class': 6,
}
training_args = {
#'use_gpu': True,
'batch_size': 20,
'optimizer_args': {
'lr': 1e-5
},
'num_updates': 5,
'dry_run': False,
}
```
%% Cell type:markdown id:690d617b tags:
By changing the elements in tags we can do client selection. \
As we saw, **client1** is defined by ['mednist-jupyter-username', 'client1'] and **client2** is defined by ['mednist-jupyter-username', 'client2']
If we want to train a model with only **client1**, we can set
*tags = ['mednist-jupyter-username', 'client1']*
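The effect of the tags on client selection can be illustrated with a small stand-alone sketch. It assumes that a dataset is selected when it carries every requested tag; this matching rule, the `nodes` registry, and the function names are assumptions made for illustration and are not taken from the Fed-BioMed source.

```python
def matches(dataset_tags, requested_tags):
    """Return True when every requested tag is present on the dataset."""
    return set(requested_tags).issubset(dataset_tags)


# Hypothetical registry: node name -> tags of the dataset it exposes.
nodes = {
    "client1": ["mednist-jupyter-username", "client1"],
    "client2": ["mednist-jupyter-username", "client2"],
}


def select_nodes(requested_tags):
    """List the nodes whose dataset carries all requested tags."""
    return sorted(name for name, tags in nodes.items() if matches(tags, requested_tags))


# The shared tag alone selects both clients; adding 'client1' narrows the selection.
both = select_nodes(["mednist-jupyter-username"])
only_first = select_nodes(["mednist-jupyter-username", "client1"])
```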
%% Cell type:markdown id:bef703f8 tags:
<div class="alert alert-block alert-info"> <b>TAGS:</b> Replace %%%% in the tags with your username </div>
%% Cell type:code id:9145caad tags:
``` python
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
tags = ['mednist-jupyter-%%%%']
rounds = 3
exp = Experiment(tags=tags,
model_args=model_args,
training_plan_class=TrainingPlan,
training_args=training_args,
round_limit=rounds,
aggregator=FedAverage(),
node_selection_strategy=None,
#tensorboard=True,
#save_breakpoints=True
)
```
%% Cell type:code id:68723ce1 tags:
``` python
exp.run()
```
%% Cell type:markdown id:5370b869 tags:
## Tensorboard: how to follow the progress of your run
Follow the [instructions](https://ai4health-2023.gitlabpages.inria.fr/ai4health-fedbiomed.gitlabpages.inria.fr/fedbiomed-tutorial/tensorboard-instructions.html) to obtain a port number, and run the commands below.
If \<IP\> is the IP assigned to you at this workshop, you may also view the tensorboard in a new browser tab at
http://\<IP\>:\<YOUR PORT NUMBER\>
%% Cell type:code id:2da2573e tags:
``` python
from fedbiomed.researcher.environ import environ
tensorboard_dir = environ['TENSORBOARD_RESULTS_DIR']
```
%% Cell type:code id:17891e1e tags:
``` python
tensorboard --logdir "$tensorboard_dir" --host 0.0.0.0 --port <YOUR PORT NUMBER>
```
%% Cell type:code id:d7c5287b tags:
``` python
exp = Experiment(tags=tags,
model_args=model_args,
training_plan_class=TrainingPlan,
training_args=training_args,
round_limit=rounds,
aggregator=FedAverage(),
node_selection_strategy=None,
tensorboard=True,
#save_breakpoints=True
)
exp.run()
```
%% Cell type:markdown id:b845e200 tags:
## Retrieving the saved model
%% Cell type:code id:8ba46d2c tags:
``` python
trained_model = exp.training_plan().model()
trained_model.load_state_dict(exp.aggregated_params()[rounds - 1]['params'])
```
%% Cell type:code id:3493cf0e tags:
``` python
trained_model
```
%% Cell type:markdown id:0c5f3e94 tags:
## Testing the model on a local dataset
%% Cell type:code id:a7c60676 tags:
``` python
class MedNISTDataset(torch.utils.data.Dataset):
    def __init__(self, image_files, labels, transforms):
        self.image_files = image_files
        self.labels = labels
        self.transforms = transforms

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, index):
        return self.transforms(self.image_files[index]), self.labels[index]
```
%% Cell type:code id:fa84d474 tags:
``` python
def training_data(dataset_path, batch_size=32):
    def parse_data(path):
        # One sub-folder per class; each sub-folder holds the images of that class
        class_names = sorted(x for x in os.listdir(path) if os.path.isdir(os.path.join(path, x)))
        num_class = len(class_names)
        image_files = [
            [
                os.path.join(path, class_names[i], x)
                for x in os.listdir(os.path.join(path, class_names[i]))
            ]
            for i in range(num_class)
        ]
        return image_files, num_class

    image_files, num_class = parse_data(dataset_path)

    # Flatten the per-class file lists and build the matching label list
    num_each = [len(image_files[i]) for i in range(num_class)]
    image_files_list = []
    image_class = []
    for i in range(num_class):
        image_files_list.extend(image_files[i])
        image_class.extend([i] * num_each[i])

    # No random augmentation here: this loader is used for evaluation
    transforms = Compose(
        [LoadImage(image_only=True), AddChannel(), ScaleIntensity(), EnsureType()])
    ds = MedNISTDataset(image_files_list, image_class, transforms)
    return DataLoader(dataset=ds, batch_size=batch_size, shuffle=False)
```
%% Cell type:code id:137bac5b tags:
``` python
def testing_accuracy(model, data_loader):
    model.eval()
    loss = 0
    correct = 0
    device = 'cpu'

    y_pred = []
    y_actu = []
    with torch.no_grad():
        for data, target in tqdm(data_loader, desc="Evaluation"):
            data, target = data.to(device), target.to(device)
            output = model(data)
            loss += torch.nn.functional.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the largest output
            correct += pred.eq(target.view_as(pred)).sum().item()
            y_pred.extend(torch.flatten(pred).tolist())
            y_actu.extend(target.tolist())

    # Name each series after what it holds (the names were swapped before)
    y_pred = pd.Series(y_pred, name='Predicted')
    y_actu = pd.Series(y_actu, name='Actual')
    cm = confusion_matrix(y_actu, y_pred, labels=range(6))

    loss /= len(data_loader.dataset)
    accuracy = 100 * correct / len(data_loader.dataset)
    return loss, accuracy, cm
```
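The accuracy returned by the evaluation function can also be read off the confusion matrix: with rows as true labels and columns as predicted labels (the scikit-learn convention), correct predictions lie on the diagonal. A minimal sketch with a made-up 2x2 matrix; the function name is hypothetical.

```python
def accuracy_from_confusion_matrix(cm):
    """Percentage of correctly classified samples: diagonal sum over total count."""
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(cm[i][i] for i in range(len(cm)))
    return 100.0 * correct / total


# Made-up counts for illustration only (not results from the tutorial).
toy_cm = [[8, 2],
          [1, 9]]
toy_accuracy = accuracy_from_confusion_matrix(toy_cm)
```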
%% Cell type:code id:4d0d49c8 tags:
``` python
test_client_path = '/datasets/MedNIST/client_3'
test_dl = training_data(test_client_path)
```
%% Cell type:code id:56f65c84 tags:
``` python
test_loss, test_accuracy, test_cm = testing_accuracy(trained_model, test_dl)
```
%% Cell type:code id:1d985c01 tags:
``` python
print(f"Test loss = {test_loss:.2f}")
print(f"Test accuracy = {test_accuracy:.2f}%")
test_cm
disp = ConfusionMatrixDisplay(confusion_matrix=test_cm,
display_labels=range(6))
disp.plot()
plt.show()
```
%% Cell type:markdown id:938955de tags:
## Compare different aggregators: FedProx
%% Cell type:markdown id:188e30b8 tags:
Similar to FedAvg, FedProx aggregates the local model parameters with a weighted sum. In addition, FedProx adds a regularization term to the local training objective in order to tackle statistical heterogeneity.
To use FedProx, keep the `FedAverage` aggregator from `fedbiomed.researcher.aggregators` and specify a value for mu in the training arguments `training_args`, using the argument name `fedprox_mu`.
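The two ingredients can be sketched in plain Python: a sample-weighted average of the local parameters, and the proximal term (mu / 2) * ||w - w_global||^2 added to the local loss, as in the standard FedProx formulation. How Fed-BioMed applies `fedprox_mu` internally is not shown here; the function names and the toy numbers are assumptions for illustration.

```python
def weighted_average(local_params, n_samples):
    """FedAvg-style aggregation: average parameter vectors weighted by sample counts."""
    total = sum(n_samples)
    dim = len(local_params[0])
    return [
        sum(n * params[k] for params, n in zip(local_params, n_samples)) / total
        for k in range(dim)
    ]


def fedprox_loss(task_loss, local_params, global_params, mu):
    """Local objective with the proximal term (mu / 2) * ||w - w_global||^2."""
    squared_distance = sum((w - g) ** 2 for w, g in zip(local_params, global_params))
    return task_loss + 0.5 * mu * squared_distance


# Two toy clients with 1 and 3 samples respectively.
global_params = weighted_average([[1.0, 2.0], [3.0, 6.0]], n_samples=[1, 3])

# With mu = 0 the penalty vanishes and the local objective is the plain task loss.
penalized = fedprox_loss(1.0, [2.0, 4.0], [1.0, 2.0], mu=0.1)
unpenalized = fedprox_loss(1.0, [2.0, 4.0], [1.0, 2.0], mu=0.0)
```

A larger mu keeps each local model closer to the global one, which limits client drift at the cost of slower local progress.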
%% Cell type:markdown id:83d4ad47 tags:
Try changing the value of mu to see how it affects performance.
%% Cell type:code id:be68cc84 tags:
``` python
model_args = {
'num_class': 6,
}
training_args = {
#'use_gpu': True,
'batch_size': 20,
'optimizer_args': {
'lr': 1e-5
},
'num_updates': 5,
'dry_run': False,
'fedprox_mu': 0.1,
}
```
%% Cell type:code id:cc00f668 tags:
``` python
exp = Experiment(tags=tags,
model_args=model_args,
training_plan_class=TrainingPlan,
training_args=training_args,
round_limit=rounds,
aggregator=FedAverage(),
node_selection_strategy=None,
tensorboard=True,
#save_breakpoints=True
)
exp.run()
```
# Workshop slides
You may find a copy of the slides on [google drive](https://docs.google.com/presentation/d/1mfzVuRQXASaPj9OU_5HCA_X4Cjf79UthbpLUyeDQyF4/edit?usp=sharing).
# Using Tensorboard during the tutorial
Prerequisites:
- the FL experiment should be ready to run in your jupyter notebook
- your JupyterHub IP address
- a network port from the table below
## Network ports
### Friday, July 7th
| user | port |
| --- | --- |
| francesco | 6006 |
| lucia | 6007 |
| yannick\_shaofeng | 6006 |
| emma | 6007 |
| celine | 6008 |
| isabella | 6009 |
| brice | 6010 |
| melanie | 6011 |
| theo | 6012 |
| javier | 6008 |
| hugo | 6009 |
| aleksandar | 6010 |
| narges | 6011 |
| charlotte | 6012 |
| hanae | 6013 |
| yevgenyi | 6006 |
| lea | 6007 |
| theodore | 6008 |
| takoua | 6009 |
| paula | 6010 |
| quoc\_viet | 6011 |
| franka | 6012 |
### Thursday, July 6th
| user | port |
| --- | --- |
| francesco | 6006 |
| lucia | 6007 |
| fouzi | 6008 |
| david | 6009 |
| iege | 6010 |
| idan | 6011 |
| colleen | 6012 |
| shambhavi | 6013 |
| olivier | 6014 |
| rebeca | 6008 |
| yannick\_shaofeng | 6009 |
| charles\_andrew | 6010 |
| jack | 6011 |
| abhishek | 6012 |
| maelys | 6013 |
| valentin | 6014 |
| nilesh | 6008 |
| floriane | 6009 |
| aymeric | 6010 |
| stanislas | 6011 |
| jorge | 6012 |
| camille | 6013 |
## Running Tensorboard
Follow the instructions provided in the notebook tutorials. It amounts to finding out in which directory the tensorboard data are saved, and executing
```bash
tensorboard --logdir <PATH TO TENSORBOARD DATA> --host 0.0.0.0 --port <PORT FROM TABLE ABOVE>
```
## Accessing Tensorboard
Copy the IP address that you used to access JupyterHub, append `:<your port number>` to it, and paste the result in your browser.
For example, if your address was `1.2.3.4` and your port is `8456`, you would enter `http://1.2.3.4:8456` in your browser's address bar.
%% Cell type:markdown id:64e87007 tags:
# Heart disease detection
%% Cell type:markdown id:9e351015 tags:
In this tutorial, we will focus on applying federated learning techniques to a classification problem using Scikit-Learn, a popular machine learning library in Python. We will walk you through the process step by step, from setting up the federated learning environment to evaluating the model's performance.
Scikit-Learn, also known as sklearn, is a popular machine learning library in Python. It provides a wide range of tools and algorithms for tasks such as data preprocessing, feature selection, model training, and evaluation. Sklearn is widely used for tasks such as classification, regression, clustering, and dimensionality reduction. It offers a user-friendly interface and integrates well with other libraries in the Python ecosystem, making it a go-to choice for many machine learning practitioners and researchers.
%% Cell type:code id:ade4cbea tags:
``` python
%load_ext autoreload
%autoreload 2
```
%% Cell type:markdown id:5827f560 tags:
# Table of contents
1. [The dataset](#dataset)
2. [Task 1: training plan](#task1)
3. [Task 2: the experiment](#task2)
4. [Task 3: model validation](#task3)
%% Cell type:markdown id:59d00ae5 tags:
# Tutorial
%% Cell type:markdown id:b9036f7d tags:
## The dataset <a name="dataset"></a>
%% Cell type:markdown id:e4f03a34 tags:
The Heart Disease dataset available at https://archive.ics.uci.edu/dataset/45/heart+disease is a widely used dataset in the field of cardiovascular research and machine learning. It contains a collection of medical attributes from patients suspected of having heart disease, along with their corresponding diagnosis (presence or absence of heart disease). The dataset includes information such as age, sex, blood pressure, cholesterol levels, and various other clinical measurements.
It was collected in four hospitals in the USA, Switzerland and Hungary, and contains tabular information about 740 patients distributed among these four clients.
A federated version of this dataset has been proposed in [FLamby](https://arxiv.org/pdf/2210.04620.pdf). Following their approach, we preprocess the dataset by removing missing values and encoding non-binary categorical variables as dummy variables. We finally obtain the following centers:
| Number | Client | Dataset size |
|--------|----------------------|--------------|
| 0 | Cleveland’s Hospital | 303 |
| 1 | Hungarian Hospital | 261 |
| 2 | Switzerland Hospital | 46 |
| 3 | Long Beach Hospital | 130 |
%% Cell type:markdown id:6d8554fa tags:
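For teaching purposes, we decided to merge the centers in pairs: client0 with client2 (303 + 46 samples) and client1 with client3 (261 + 130 samples), which is the pairing consistent with the sizes below.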
The final federated scenario, in this way, is the following:
- **client1**, with 349 elements
- **client2**, with 391 elements
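The arithmetic behind these two sizes can be checked directly against the table of centers. The sketch below enumerates every way of pairing two centers and keeps the pairings whose sizes equal 349 and 391; variable names are chosen for illustration.

```python
from itertools import combinations

# Dataset sizes per center, copied from the table of centers.
sizes = {0: 303, 1: 261, 2: 46, 3: 130}
total = sum(sizes.values())

# For each pair of centers, record (size of the pair, size of the remaining two).
pair_sizes = {}
for pair in combinations(sizes, 2):
    rest = [c for c in sizes if c not in pair]
    pair_sizes[pair] = (
        sizes[pair[0]] + sizes[pair[1]],
        sum(sizes[c] for c in rest),
    )

# Pairings that reproduce the two federated clients of 349 and 391 samples.
matching = sorted(p for p, s in pair_sizes.items() if set(s) == {349, 391})
```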
%% Cell type:markdown id:e860a74e tags:
## Task 1: Defining the training plan <a name="task1"></a>
%% Cell type:markdown id:fd2daf01 tags:
A training plan is a class that defines the four main components of federated model training: the data, the model, the loss and the optimizer. It is responsible for providing custom methods allowing every node to perform the training.
In the case of scikit-learn, Fed-BioMed already does a lot of the heavy lifting for you by providing the FedPerceptron, FedSGDClassifier and FedSGDRegressor classes as training plans. These classes already take care of the model, optimizer, loss function and related dependencies for you, so you only need to define how the data will be loaded.
In this tutorial we are going to use an [SGDClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html), so the related FedSGDClassifier training plan.
%% Cell type:markdown id:11ec284b tags:
### Model arguments
*model_args* is a dictionary with the arguments related to the model, which will be passed to the model constructor (here, `SGDClassifier`).
**IMPORTANT** For classification tasks, you are required to specify the following two fields:
- n_features: the number of features in each input sample (in our case, the number of columns in the tabular data, excluding the target)
- n_classes: the number of classes in the target data
Other model arguments depend on the specific model you are using, and are defined in the model definition. Refer to the model documentation.
### Training arguments
*training_args* is a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.
**IMPORTANT** To set the training arguments we may either pass them to the Experiment constructor, or set them on an instance with the setter method:
`exp.set_training_arguments(training_args=training_args)`
Setters are also available for other experiment settings, for example:
`exp.set_aggregator(aggregator=FedAverage)`
%% Cell type:markdown id:7571ab66 tags:
**TO_DO:**
- Apply the scaler to your data
- Define training args: num_updates, batch_size.
- Define model args as explained above.
%% Cell type:code id:998bb584 tags:
``` python
from fedbiomed.common.training_plans import FedSGDRegressor, FedPerceptron, FedSGDClassifier
from fedbiomed.common.data import DataManager
from sklearn.preprocessing import MinMaxScaler
import pandas as pd


class SkLearnClassifierTrainingPlan(FedSGDClassifier):
    def init_dependencies(self):
        """Define additional dependencies.

        In this case, we rely on scikit-learn's MinMaxScaler for preprocessing the data.
        """
        # pandas is listed as well because training_data below calls pd.read_csv
        # (added here as a precaution; it may already be available on the node).
        return ["from sklearn.preprocessing import MinMaxScaler",
                "import pandas as pd"]

    def training_data(self, batch_size):
        # The last column holds the target, all other columns are features
        df = pd.read_csv(self.dataset_path, delimiter=';', header=None)
        X = df.iloc[:, :-1]
        y = df.iloc[:, -1]
        self.scaler = MinMaxScaler()
        # X = [...] TODO: apply the transformer to the data.
        return DataManager(dataset=X, target=y, batch_size=batch_size, shuffle=True)
```
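As background for the scaling step: with its default feature range, `MinMaxScaler` maps each column to [0, 1] by computing (x - min) / (max - min). The sketch below reproduces that formula in plain Python on made-up values; it is an illustration of the transformation, not the solution to the exercise, and the function name is hypothetical.

```python
def min_max_scale(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


ages = [29, 54, 77]  # made-up values, not taken from the dataset
scaled = min_max_scale(ages)
```

Scaling matters here because SGD-based linear models are sensitive to features that live on very different numeric ranges.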
%% Cell type:code id:653e8aea tags:
``` python
n_features = 18
n_classes = 2
model_args = { 'max_iter':100,
'tol': 1e-1 ,
'loss': 'huber',
# [...] TODO: Insert the missing model arguments.
}
training_args = {
# [...] TODO: Insert the training arguments as elements in the dic.
}
```
%% Cell type:markdown id:b3020aa6 tags:
## Task 2: the Experiment <a name="task2"></a>
%% Cell type:markdown id:9ae58ccf tags:
The experiment enables Federated Learning by orchestrating the training process across multiple nodes. It searches for datasets based on specific tags, uploads the training plan file, sends model and training arguments, tracks and checks training progress, and downloads and aggregates model parameters for the next round.
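The orchestration described above can be mimicked with a toy simulation: in each round the current model is sent to every node, each node performs a few local updates, and the replies are aggregated with a sample-weighted average. This is only a sketch with a one-parameter model; the function names, data, and learning rate are invented for illustration and bear no relation to the Fed-BioMed implementation.

```python
def local_update(weight, data, lr=0.1, num_updates=5):
    """Node side: a few gradient steps on (weight - local mean)^2."""
    target = sum(data) / len(data)
    for _ in range(num_updates):
        weight -= lr * 2 * (weight - target)
    return weight


def run_rounds(node_data, rounds, initial_weight=0.0):
    """Researcher side: broadcast, collect local updates, aggregate, repeat."""
    weight = initial_weight
    history = []
    for _ in range(rounds):
        replies = [(local_update(weight, data), len(data)) for data in node_data.values()]
        total = sum(n for _, n in replies)
        weight = sum(w * n for w, n in replies) / total  # sample-weighted average
        history.append(weight)
    return history


# Two toy nodes; the sample-weighted mean of all their data is 4.0.
history = run_rounds({"site_1": [1.0, 3.0], "site_2": [5.0, 5.0, 5.0, 5.0]}, rounds=10)
```

In this toy setting the aggregated weight approaches the mean of the pooled data even though no node ever shares its samples.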
%% Cell type:markdown id:c17ef104 tags:
**TO_DO:**
- Define the used training plan.
- Pass model and training arguments
%% Cell type:markdown id:28f602da tags:
<div class="alert alert-block alert-info"> <b>TAGS:</b> Replace %%%% in the tags with your username </div>
%% Cell type:code id:4b1a1341 tags:
``` python
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
tags = ['heart-jupyter-%%%%']
rounds = 10
# search for corresponding datasets across nodes datasets
exp = Experiment(tags=tags,
model_args=None, #TODO: insert the correct value
training_plan_class=None, #TODO: insert the correct value
training_args=None, #TODO: insert the correct value
round_limit=rounds,
aggregator=FedAverage(),
node_selection_strategy=None)
```
%% Cell type:code id:d6ff55da tags:
``` python
exp.run()
```
%% Cell type:markdown id:88e2a782 tags:
## Task 3: Model Validation <a name="task3"></a>
%% Cell type:markdown id:84f2ad10 tags:
During federated training, model validation plays a crucial role in assessing performance without a dedicated holdout dataset. Fed-BioMed enables separate model validation on each node after parameter updates, allowing comparison of model performances. Two types of validation can be performed:
- one on globally updated parameters before training a round,
- another on locally updated parameters after local training is completed on a node.
This helps users evaluate the impact of node-specific training on model improvement.
%% Cell type:markdown id:1cbc4d1a tags:
Here is the list of validation arguments that can be configured.
- *test_ratio*: Ratio of the validation partition of the dataset. The remaining samples will be used for training. By default, it is 0.0.
- *test_on_global_updates*: Boolean value that indicates whether validation will be applied to globally updated (aggregated) parameters (see Figure 1). Default is False
- *test_on_local_updates*: Boolean value that indicates whether validation will be applied to locally updated (trained) parameters (see Figure 1). Default is False
- *test_metric*: One of MetricTypes that indicates which metric will be used for validation. It can be a str or an instance of MetricTypes (e.g. MetricTypes.RECALL or RECALL). If it is None and no testing_step is defined in the training plan (see section: Define Custom Validation Step), the default metric is ACCURACY.
- *test_metric_args*: A dictionary that contains the arguments that will be used for the metric function.
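To make the role of *test_ratio* concrete, the sketch below splits a list of sample indices into a training and a validation partition. The shuffling, the rounding rule, and the function name are assumptions made for illustration; the exact procedure used on the nodes may differ.

```python
import random


def split_by_test_ratio(samples, test_ratio, seed=0):
    """Return (training, validation) partitions of the samples.

    Assumption: the validation partition holds round(test_ratio * n) samples
    drawn after a seeded shuffle.
    """
    if not 0.0 <= test_ratio <= 1.0:
        raise ValueError("test_ratio must be in [0, 1]")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(test_ratio * len(shuffled)))
    return shuffled[n_val:], shuffled[:n_val]


# With 349 samples and test_ratio=0.1, about a tenth is held out for validation.
train, val = split_by_test_ratio(range(349), test_ratio=0.1)

# The default test_ratio of 0.0 leaves the validation partition empty.
train_all, val_none = split_by_test_ratio(range(10), test_ratio=0.0)
```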
%% Cell type:markdown id:700d8da1 tags:
**TO_DO:**
- Initialize a new experiment.
- Use the setters to define the validation arguments.
- Launch the training and check the validation performances.
%% Cell type:code id:9972f57d tags:
``` python
exp = Experiment(
# [...]
training_args=training_args
)
#TODO: set the parameters using the setters. Example: exp.set_test_ratio(test_ratio=0.1)
```
%% Cell type:code id:827b9920 tags:
``` python
exp.run()
```
import torch
from torchvision import datasets
from torchvision import transforms
import matplotlib.pyplot as plt
def non_iid_split(dataset, nb_nodes, n_samples_per_node, batch_size, shuffle, shuffle_digits=False):
    assert(nb_nodes>0 and nb_nodes<=10)
    digits=torch.arange(10) if shuffle_digits==False else torch.randperm(10, generator=torch.Generator().manual_seed(0))
    # split the digits in a fair way
    digits_split=list()
    i=0
    for n in range(nb_nodes, 0, -1):
        inc=int((10-i)/n)
        digits_split.append(digits[i:i+inc])
        i+=inc
    # load and shuffle nb_nodes*n_samples_per_node from the dataset
    loader = torch.utils.data.DataLoader(dataset,
                                         batch_size=nb_nodes*n_samples_per_node,
                                         shuffle=shuffle)
    dataiter = iter(loader)
    # use the built-in next(): DataLoader iterators no longer expose a .next() method
    images_train_mnist, labels_train_mnist = next(dataiter)
    data_splitted=list()
    for i in range(nb_nodes):
        idx=torch.stack([y_ == labels_train_mnist for y_ in digits_split[i]]).sum(0).bool() # get indices for the digits
        data_splitted.append(torch.utils.data.DataLoader(torch.utils.data.TensorDataset(images_train_mnist[idx], labels_train_mnist[idx]), batch_size=batch_size, shuffle=shuffle))
    return data_splitted
def iid_split(dataset, nb_nodes, n_samples_per_node, batch_size, shuffle):
    # load and shuffle n_samples_per_node from the dataset
    loader = torch.utils.data.DataLoader(dataset,
                                         batch_size=n_samples_per_node,
                                         shuffle=shuffle)
    dataiter = iter(loader)
    data_splitted=list()
    for _ in range(nb_nodes):
        # each call to next() draws one batch of n_samples_per_node samples for a node
        data_splitted.append(torch.utils.data.DataLoader(torch.utils.data.TensorDataset(*(next(dataiter))), batch_size=batch_size, shuffle=shuffle))
    return data_splitted
def get_MNIST(type="iid", n_samples_train=200, n_samples_test=100, n_clients=3, batch_size=25, shuffle=True):
    dataset_loaded_train = datasets.MNIST(
        root="./data",
        train=True,
        download=True,
        transform=transforms.ToTensor()
    )
    dataset_loaded_test = datasets.MNIST(
        root="./data",
        train=False,
        download=True,
        transform=transforms.ToTensor()
    )
    if type=="iid":
        train=iid_split(dataset_loaded_train, n_clients, n_samples_train, batch_size, shuffle)
        test=iid_split(dataset_loaded_test, n_clients, n_samples_test, batch_size, shuffle)
    elif type=="non_iid":
        train=non_iid_split(dataset_loaded_train, n_clients, n_samples_train, batch_size, shuffle)
        test=non_iid_split(dataset_loaded_test, n_clients, n_samples_test, batch_size, shuffle)
    else:
        train=[]
        test=[]
    return train, test
def plot_samples(data, channel:int, title=None, plot_name="", n_examples =20):
    n_rows = int(n_examples / 5)
    plt.figure(figsize=(1* n_rows, 1*n_rows))
    if title: plt.suptitle(title)
    X, y= data
    for idx in range(n_examples):
        ax = plt.subplot(n_rows, 5, idx + 1)
        image = 255 - X[idx, channel].view((28,28))
        ax.imshow(image, cmap='gist_gray')
        ax.axis("off")
    if plot_name!="":plt.savefig(f"plots/{plot_name}.png")
    plt.tight_layout()
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""This code to create a custom MNIST dataset was made possible thanks to
https://github.com/LaRiffle/collateral-learning .
Note that, besides modifying the build_dataset function for my own
application, I also had to change rgba_to_rgb: the original function worked
as intended in Jupyter but not in Spyder, and I do not know why.
"""
import matplotlib.pyplot as plt
import numpy as np
# import from scipy.ndimage directly: the .interpolation and .filters submodules are deprecated
from scipy.ndimage import map_coordinates
from scipy.ndimage import gaussian_filter
import pickle
import torch
import math
import os
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import Dataset,DataLoader
"""PLOT FUNCTIONS TO VISUALIZE THE FONTS AND DATASETS"""
def show_original_font(family:str):
    """Plot the original numbers used to create the dataset"""
    plt.figure()
    plt.title(family)
    plt.text(0, 0.4, '1234567890', size=50, family=family)
    plt.axis("off")
    plt.tight_layout()
    plt.savefig(f"plots/{family}_original.png")
def convert_to_rgb(data):
    def rgba_to_rgb(rgba):
        # the canvas buffer is read as ARGB, so dropping the first value removes the alpha channel
        return rgba[1:]
    return np.apply_along_axis(rgba_to_rgb, 2, data)
def elastic_transform(image, alpha, sigma, random_state=None):
    """Elastic deformation of images as described in [Simard2003]_.
    .. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for
    Convolutional Neural Networks applied to Visual Document Analysis", in
    Proc. of the International Conference on Document Analysis and
    Recognition, 2003.
    """
    if random_state is None:
        random_state = np.random.RandomState(None)
    shape = np.array([28, 28, 3], dtype =int)
    dx = gaussian_filter((random_state.rand(*shape) * 2 - 1), sigma, mode="constant", cval=0) * alpha
    dy = gaussian_filter((random_state.rand(*shape) * 2 - 1), sigma, mode="constant", cval=0) * alpha
    x, y, z = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]))
    #print(x.shape, y.shape, z.shape)
    #print(dx.shape, dy.shape)
    #x, y, z = x[:28, :28, :3], y[:28, :28, :3], z[:28, :28, :3]
    #dx, dy = dx[:28, :28, :3], dy[:28, :28, :3]
    indices = np.reshape(y+dy, (-1, 1)), np.reshape(x+dx, (-1, 1)), np.reshape(z, (-1, 1))
    distorted_image = map_coordinates(image, indices, order=1, mode='reflect')
    return distorted_image.reshape(shape)
def center(data):
    # Invert black and white
    wb_data = np.ones(data.shape) * 255 - data
    # normalize
    prob_data = wb_data / np.sum(wb_data)
    # marginal distributions
    dx = np.sum(prob_data, (1, 2))
    dy = np.sum(prob_data, (0, 2))
    # expected values
    (X, Y, Z) = prob_data.shape
    cx = np.sum(dx * np.arange(X))
    cy = np.sum(dy * np.arange(Y))
    # Check bounds
    assert cx > X/4 and cx < 3 * X/4, f"ERROR: {cx} > {X/4} and {cx} < {3 * X/4}"
    assert cy > Y/4 and cy < 3 * Y/4, f"ERROR: {cy} > {Y/4} and {cy} < {3 * Y/4}"
    # print('Center', cx, cy)
    x_min = int(round(cx - X/4))
    x_max = int(round(cx + X/4))
    y_min = int(round(cy - Y/4))
    y_max = int(round(cy + Y/4))
    return data[x_min:x_max, y_min:y_max, :]
def create_transformed_digit(digit:int, size:float, rotation:float, family:str):
    fig = plt.figure(figsize=(2,2), dpi=28)
    fig.text(0.4, 0.4, str(digit), size=size, rotation=rotation, family=family)
    # Remove axes, draw and get the ARGB buffer of the digit
    plt.axis('off')
    fig.canvas.draw()
    data = np.frombuffer(fig.canvas.tostring_argb(), dtype=np.uint8)
    data = data.reshape(fig.canvas.get_width_height()[::-1] + (4,))
    # Convert to rgb
    data = convert_to_rgb(data)
    # Center the data
    data = center(data)
    # Apply an elastic deformation
    data = elastic_transform(data, alpha=991, sigma=9)
    # Free memory space
    plt.close(fig)
    return data
def save_dataset(dataset_name:str, array_X:np.array, array_y:np.array):
    with open(f'{dataset_name}.pkl', 'wb') as output:
        dataset = array_X, array_y
        pickle.dump(dataset, output)
def build_dataset(C:dict, std_size=2.5):
    """Build a dataset of `C['n_samples']` samples according to the chosen
    font and deformation. Only digits in `C['numbers']` are in the created
    dataset. The result is cached in a pickle file named after the config."""
    numbers_str="".join([str(n) for n in C['numbers']])
    file_name=f"{C['font']}_{numbers_str}_{C['n_samples']}_{C['tilt']}_{C['seed']}"
    # Reuse the cached dataset if it was already generated with this config
    if os.path.isfile(f"{file_name}.pkl"):
        with open(f"{file_name}.pkl", "rb") as cached:
            return pickle.load(cached)
    if C['seed']: np.random.seed(C['seed'])
    #Make a plot of each original digit to know what they look like
    # show_original_font(C['font'])
    list_X = []
    list_y= []
    for i in range(C['n_samples']):
        if i%10 == 0: print(round(i / C['n_samples'] * 100), '%')
        X = np.zeros((3, 28, 28 ))
        #Choosing a number at this step and its transformation characteristics
        digit = C["numbers"][np.random.randint(len(C["numbers"]))]
        # One channel per tilt: channel j holds the digit rendered with tilt j
        for j, tilt in enumerate(C['tilt']):
            rotation = tilt + np.random.normal(0, C['std_tilt'])
            size = 60 + np.random.normal(0, std_size)
            X_tilt=create_transformed_digit(digit, size, rotation, C['font'])
            X[j] = X_tilt[:, :, j]
        # Append data to the datasets
        #list_X.append(X[:,:,0])
        list_X.append(X)
        list_y.append([digit])
    #save the dataset
    dataset = (np.array(list_X), np.array(list_y))
    with open(f'{file_name}.pkl', 'wb') as output:
        pickle.dump(dataset, output)
    return np.array(list_X), np.array(list_y)
class Ds_MNIST_modified(Dataset):
    """Creation of the dataset used to create the clients' dataloader"""
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels
    def __len__(self): return len(self.features)
    def __getitem__(self,idx):
        # 3D input: channels x 28 x 28 (build_dataset creates 3 channels)
        sample_x = torch.Tensor(self.features[idx])
        sample_y = self.labels[idx]
        return sample_x, sample_y
    def plot_samples(self, channel:int, title=None, plot_name="",
                     n_examples =20):
        n_rows = int(n_examples / 5)
        plt.figure(figsize=(1* n_rows, 1*n_rows))
        if title: plt.suptitle(title)
        for idx in range(n_examples):
            X, y = self[idx]
            ax = plt.subplot(n_rows, 5, idx + 1)
            image = 255 - X.view((-1, 28, 28))[channel]
            ax.imshow(image, cmap='gist_gray')
            ax.axis("off")
        if plot_name!="":plt.savefig(f"plots/{plot_name}.png")
        plt.tight_layout()
def get_synth_MNIST(clients, batch_size:int, shuffle=True):
    """Return the lists of training and testing DataLoaders, one pair per client."""
    list_train, list_test = [], []
    for C in clients:
        X, y = build_dataset(C)
        # invert and rescale pixel values to [0, 1]
        X = (255 - X) /255
        X_train, y_train = X[:C['n_samples_train']], y[:C['n_samples_train']]
        X_test, y_test = X[C['n_samples_train']:], y[C['n_samples_train']:]
        train_ds = Ds_MNIST_modified(X_train, y_train)
        train_dl = DataLoader(train_ds, batch_size = batch_size, shuffle = shuffle)
        list_train.append(train_dl)
        test_ds = Ds_MNIST_modified(X_test, y_test)
        test_dl = DataLoader(test_ds, batch_size = batch_size, shuffle = shuffle)
        list_test.append(test_dl)
    return list_train, list_test