From e7fae4b5f9c39d7cdd4fc18dcb06e2ee80918e55 Mon Sep 17 00:00:00 2001
From: Paul Andrey <paul.andrey@inria.fr>
Date: Mon, 19 Feb 2024 10:54:56 +0100
Subject: [PATCH] Update online documentation to reflect API changes.

- Document changes in the registration and initialization processes.
- Document changes to the Optimizer AuxVar API.
- Document requirement for peers using the same DecLearn version.
---
 docs/setup.md                 | 12 ++++++++++
 docs/user-guide/fl_process.md | 42 ++++++++++++++++++++++-------------
 docs/user-guide/optimizer.md  | 32 +++++++++++++-------------
 3 files changed, 54 insertions(+), 32 deletions(-)

diff --git a/docs/setup.md b/docs/setup.md
index 7b00c0aa..0ae2ead3 100644
--- a/docs/setup.md
+++ b/docs/setup.md
@@ -1,5 +1,17 @@
 # Installation guide

+This guide provides all the information required to install `declearn`.
+
+**TL;DR**:<br/>
+If you want to install the latest stable version with all of its optional
+dependencies, simply run `pip install declearn[all]` from your desired
+python (preferably virtual) environment.
+
+**Important note**:<br/>
+When running a federated process with DecLearn, the server and all clients
+should use the same `major.minor` version; otherwise, clients' registration
+will fail verbosely, prompting them to install the server's version.
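+
+For instance, you may check which DecLearn version is installed in a given
+environment with the following snippet (a minimal sketch, assuming the
+package exposes the conventional `__version__` attribute):
+
+```python
+import declearn
+
+# The `major.minor` part of this version should match across all peers.
+print(declearn.__version__)
+```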
+
 ## Requirements

 - python >= 3.8
diff --git a/docs/user-guide/fl_process.md b/docs/user-guide/fl_process.md
index 85f9fd1d..2e68a465 100644
--- a/docs/user-guide/fl_process.md
+++ b/docs/user-guide/fl_process.md
@@ -11,8 +11,10 @@ exposed here.
 ## Overall process orchestrated by the server

 - Initially:
-  - have the clients connect and register for training
-  - prepare model and optimizer objects on both sides
+  - the clients connect to the server and register for training
+  - the server may collect targeted metadata from clients when required
+  - the server sets up the model, optimizers, aggregator and metrics
+  - all clients receive instructions to set up these objects as well
 - Iteratively:
   - perform a training round
   - perform an evaluation round
@@ -36,25 +38,35 @@
     registered, optionally under a given timeout delay)
   - close registration (reject future requests)
 - Client:
-  - gather metadata about the local training dataset
-    (_e.g._ dimensions and unique labels)
-  - connect to the server and send a request to join training,
-    including the former information
+  - connect to the server and send a request to join training
   - await the server's response (retry after a timeout if the request
     came in too soon, i.e. registration is not opened yet)
-- messaging : (JoinRequest <-> JoinReply)

 ### Post-registration initialization

+#### (Optional) Metadata exchange
+
+- This step is optional, and depends on the trained model's requirement
+  for dataset information (typically, features' shape and/or dtype).
+- Server:
+  - query clients for targeted metadata about the local training datasets
+- Client:
+  - collect and send back queried metadata
+- messaging: (MetadataQuery <-> MetadataReply)
+- Server:
+  - validate and aggregate received information
+  - pass it to the model so as to finalize its initialization
+
+#### Initialization of the federated optimization problem
+
 - Server:
-  - validate and aggregate clients-transmitted metadata
-  - finalize the model's initialization using those metadata
-  - send the model, local optimizer and evaluation metrics specs to clients
+  - set up the model, local and global optimizers, aggregator and metrics
+  - send specs to the clients so that they set up local counterpart objects
 - Client:
-  - instantiate the model, optimizer and metrics based on server instructions
-- messaging: (InitRequest <-> GenericMessage)
+  - instantiate the model, optimizer, aggregator and metrics based on specs
+- messaging: (InitRequest <-> InitReply)

-### (Optional) Local differential privacy setup
+#### (Optional) Local differential privacy setup

 - This step is optional; a flag in the InitRequest at the previous step
   indicates to clients that it is to happen, as a secondary substep.
@@ -91,8 +103,8 @@

 - Server:
   - select clients that are to participate
-  - send data-batching parameters and shared model trainable weights
-  - (_send effort constraints, unused for now_)
+  - send data-batching parameters and effort constraints
+  - send shared model trainable weights
 - Client:
   - update model weights
   - perform evaluation steps based on effort constraints
diff --git a/docs/user-guide/optimizer.md b/docs/user-guide/optimizer.md
index 103ceb1d..a57dda61 100644
--- a/docs/user-guide/optimizer.md
+++ b/docs/user-guide/optimizer.md
@@ -280,7 +280,7 @@ To implement Scaffold in Declearn, one needs to set up both server-side and
 client-side OptiModule plug-ins. The client-side module is in charge of both
 correcting input gradients and computing the required quantities to update the
 states at the end of each training round, while the server-side module merely
-manages the computation and distribution of the global and correction states.
+manages the computation and distribution of the global reference state.

 The following snippet sets up a pair of client-side and server-side optimizers
 that implement Scaffold, here with a 0.001 learning rate on the client side and
@@ -447,18 +447,17 @@ Declearn introduces the notion of "auxiliary variables" to cover such cases:
 - The packaging and distribution of module-wise auxiliary variables is done by
   `Optimizer.collect_aux_var` and `process_aux_var`, which orchestrate calls to
   the plugged-in modules' methods of the same name.
-- The management and compartementalization of client-wise auxiliary variables
-  information is also automated as part of `declearn.main.FederatedServer`, to
-  prevent information leakage between clients.
+- Exchanged information is formatted via dedicated `AuxVar` data structures
+  (inheriting `declearn.optimizer.modules.AuxVar`), which define how to
+  aggregate peers' data and indicate how to apply secure aggregation on top
+  of it (when it is possible to do so). A usage sketch is provided below.
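+
+For illustration purposes, here is a minimal, hypothetical sketch of a full
+auxiliary variables exchange (clients to server, then server back to the
+clients), written outside of the `FederatedServer` / `FederatedClient`
+classes that normally automate it. The optimizer variables are placeholders
+(e.g. set up as in the Scaffold snippet above), and client-emitted `AuxVar`
+instances are assumed to support aggregation by summation:
+
+```python
+# Hypothetical placeholders: `client_optims` is a list of client-side
+# declearn Optimizer instances; `server_optim` is the server-side one.
+
+# Collect module-wise auxiliary variables from each client.
+client_aux = [optim.collect_aux_var() for optim in client_optims]
+# -> each element is a `{module_aux_name: AuxVar instance}` dict
+
+# Aggregate client-wise AuxVar instances into a single one per module
+# (assumption: summation implements each structure's aggregation rule).
+agg_aux = {}
+for aux in client_aux:
+    for key, val in aux.items():
+        agg_aux[key] = val if key not in agg_aux else agg_aux[key] + val
+
+# Have the server-side optimizer process the aggregated variables, then
+# collect its own auxiliary variables and share them with all clients.
+server_optim.process_aux_var(agg_aux)
+server_aux = server_optim.collect_aux_var()
+for optim in client_optims:
+    optim.process_aux_var(server_aux)
+```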

 #### OptiModule and Optimizer auxiliary variables API

 At the level of any `OptiModule`:

-- `OptiModule.collect_aux_var` should output a dict that may either have a
-  simple `{key: value}` structure (for server-purposed or shared-across-clients
-  information), or a nested `{client_name: {key: value}}` structure (that is to
-  be split in order to send distinct information to the clients).
+- `OptiModule.collect_aux_var` should output either `None` or an instance of
+  a module-specific `AuxVar` subclass wrapping data to be shared.

 - `OptiModule.process_aux_var` should expect a dict that has the same
   structure as that emitted by `collect_aux_var` (of this module class, or of a
@@ -466,11 +465,12 @@ At the level of any `OptiModule`:

 At the level of a wrapping `Optimizer`:

-- `Optimizer.collect_aux_var` emits a `{module_aux_name: module_emitted_dict}`
-  dict.
+- `Optimizer.collect_aux_var` outputs a `{module_aux_name: module_aux_var}`
+  dict to be shared.

-- `Optimizer.process_aux_var` expects a `{module_aux_name: module_emitted_dict}`
-  dict as well.
+- `Optimizer.process_aux_var` expects a `{module_aux_name: module_aux_var}`
+  dict as well, containing either server-emitted or aggregated client-emitted
+  data.

 As a consequence, you should note that:

@@ -478,10 +478,8 @@ As a consequence, you should note that:
   that have the same `name` or `aux_name`.
 - If you are using our `Optimizer` within your own orchestration code (_i.e._
   outside of our `FederatedServer` / `FederatedClient` main classes), it is up
-  to you to handle the restructuration of auxiliary variables to ensure that
-  (a) each client gets its own information (and not that of others), and that
-  (b) client-wise auxiliary variables are concatenated properly for the
-  server-side optimizer to process.
+  to you to handle the aggregation of client-wise auxiliary variables into
+  the single module-wise instance that the server should receive.

 #### Integration to the Declearn FL process

@@ -651,5 +649,5 @@ In some cases, you might want to clip your batch-averaged gradients, _e.g._ to
 prevent exploding gradients issues. This is possible in Declearn, thanks to a
 couple of `OptiModule` subclasses: `L2Clipping` (name: `'l2-clipping'`) clips
 arrays of weights based on their L2-norm, while `L2GlobalClipping` (name:
-`'l2-global-clipping`) clips all weights based on their global L2-norm (as if
+`'l2-global-clipping'`) clips all weights based on their global L2-norm (as if
 concatenated into a single array).
--
GitLab