Add local DP to declearn
Summary
The goal here is to modify declearn to implement local DP. This requires implementing four things (combined in the sketch after the list):
- Noise addition, done through the `GaussianNoiseModule`
- Per-sample norm clipping, already implemented
- Poisson sampling, added to `Dataset`
- A privacy accountant, imported from Opacus and added to the client side
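
As a rough orientation, here is a minimal sketch of how these four pieces fit together in one local DP step. The helper function is illustrative only (not declearn code); the accountant calls are the actual Opacus API.

```python
# Illustrative sketch (not declearn code): per-sample clipping + Gaussian noise
# on a batch drawn by Poisson sampling, with Opacus doing the accounting.
import numpy as np
from opacus.accountants import RDPAccountant  # privacy accountant from Opacus

def noisy_clipped_mean(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip each per-sample gradient to max_norm, sum, then add Gaussian noise."""
    clipped = [
        g * min(1.0, max_norm / (np.linalg.norm(g) + 1e-12))
        for g in per_sample_grads
    ]
    noise = rng.normal(0.0, noise_multiplier * max_norm, size=clipped[0].shape)
    return (np.sum(clipped, axis=0) + noise) / max(len(clipped), 1)

# Accounting: one step of the subsampled Gaussian mechanism.
accountant = RDPAccountant()
accountant.step(noise_multiplier=1.1, sample_rate=0.01)
print("epsilon spent:", accountant.get_epsilon(delta=1e-5))
```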
Main changes:
- `DPFederatedServer`: adds a `make_private` method to send privacy instructions to the clients
- `DPFederatedClient`: adds a `make_private` method to create an accountant and make the optimizer private (see the sketch after this list)
- `DPOptimizer`: a child of `Optimizer` that implements noise addition
- Added `GaussianNoiseModule`, `PrivacyRequest` and `PrivacyConfig`, and modified training
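
A hedged sketch of the intended `make_private` exchange. The message fields and method below are assumptions drawn from this description, not the final declearn API; only the Opacus calls are real.

```python
# Hypothetical sketch of the make_private flow; names and fields are
# assumptions, not the final declearn API. The Opacus calls are real.
from dataclasses import dataclass
from opacus.accountants import RDPAccountant
from opacus.accountants.utils import get_noise_multiplier

@dataclass
class PrivacyRequestSketch:
    epsilon: float      # target epsilon for the whole training
    delta: float        # target delta
    sclip_norm: float   # per-sample clipping norm
    rounds: int         # number of steps/rounds to budget for

class DPClientSketch:
    def make_private(self, request: PrivacyRequestSketch, sample_rate: float) -> None:
        # Create the accountant and derive the noise multiplier from the budget.
        self.accountant = RDPAccountant()
        self.noise_multiplier = get_noise_multiplier(
            target_epsilon=request.epsilon,
            target_delta=request.delta,
            sample_rate=sample_rate,
            steps=request.rounds,
        )
        # A DPOptimizer would then clip per-sample gradients, add noise with
        # std noise_multiplier * sclip_norm at each step, and call
        # accountant.step(...) after every update.
```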
To do
Must have
- 'Solve' convergence issues
- Complete documentation
- Fix `noise_multiplier` mistake
- Unit testing of noise
- Deal with Opacus dependencies at install
- Understand what happens with `None` steps
- Add privacy budget to constraints rather than just logging > implement an explicit budget limit (see the sketch after this list). To the best of my knowledge, not done in Opacus
- Check client selection for privacy requests > risk of missing some
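
One possible way to turn the budget into a hard constraint instead of a log line; this stopping rule is an assumption, not existing declearn or Opacus behaviour.

```python
# Assumed stopping rule: stop training once the spent epsilon reaches the target.
from opacus.accountants import RDPAccountant

def budget_exceeded(accountant: RDPAccountant, target_epsilon: float, delta: float) -> bool:
    """True once the epsilon spent so far reaches the target budget."""
    return accountant.get_epsilon(delta=delta) >= target_epsilon

# The client would call this before each training round and refuse to train
# further (or the server would stop selecting it) once it returns True.
```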
Nice to have
- More flexible privacy budgeting:
  - Add a dictionary option for per-client privacy budgets
  - Allow the user to give the noise multiplier directly rather than a privacy budget
- Safe_mode and creating random Vectors:
  - Safe_mode for Poisson sampling
  - Implement random vector creation at the base class level (`Vector`) and use framework-relevant generators
- Refine clipping:
  - Implement per-layer clipping (see `Opacus.optimizers.perlayeroptimizer.py`; a sketch follows this list)
  - Adaptive clipping (see `Opacus.optimizers.adaclipoptimizer.py`)
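
A hedged sketch of the per-layer clipping idea: each layer's per-sample gradient gets its own clipping norm instead of a single global one. Purely illustrative; see Opacus' per-layer optimizer for the reference implementation.

```python
# Illustrative per-layer clipping: one max norm per layer (names hypothetical).
import numpy as np

def clip_per_layer(per_sample_layer_grads: dict, max_norms: dict) -> dict:
    """Clip each layer's per-sample gradient to that layer's own max norm."""
    clipped = {}
    for name, grad in per_sample_layer_grads.items():
        norm = np.linalg.norm(grad)
        clipped[name] = grad * min(1.0, max_norms[name] / (norm + 1e-12))
    return clipped
```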
Structural changes / open questions
- Opacus checks on several passes and compatible layers, e.g. `privacy_engine.forbid_accumulation_hook`
- Decide whether to keep subclassing DP objects
- Explore noise as an aux_var to allow for time-varying noise addition
- Should more things be class attributes in FedClient (optim, accountant)?
- Explore creating the private optimizer on the server side and only sending back elements such as the data size
Longer term
- Do we need a distributed training version of the DP mechanisms?
- Do we need to add DP to other information exchanges, such as validation? (See https://core.ac.uk/download/pdf/328855894.pdf)
Done
- Exchange privacy info
  - Create privacy messages
  - Client
    - Make private in Client
      - Optimizer
      - Accountant
    - At train:
      - Increment history
      - Send reply
  - FedServer
    - Make private message
    - At train
- Add `poisson_sampling` flag to `Dataset` (see the sketch after this list)
- Add noise
  - Add the noise module to the available modules
  - Add a noise instructions mechanism to `Optimizer`
- Exchange privacy budget info
  - Expose on server and client side
  - Add to communication channels
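
A minimal sketch of what the `poisson_sampling` flag changes in batch drawing: each sample is independently included with probability `sample_rate`, so the batch size varies from step to step. The function name and signature are illustrative, not the `Dataset` API.

```python
# Illustrative Poisson (Bernoulli) subsampling of batch indices.
import numpy as np

def poisson_batches(n_samples: int, sample_rate: float, n_steps: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        mask = rng.random(n_samples) < sample_rate  # independent Bernoulli draws
        yield np.flatnonzero(mask)                  # indices of the drawn batch
```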