diff --git a/declearn3.md b/declearn3.md
new file mode 100644
index 0000000000000000000000000000000000000000..c64d7a10f8fc727a49ca87ff2bb61743897d72ab
--- /dev/null
+++ b/declearn3.md
@@ -0,0 +1,40 @@
+* Integrate Fairness-aware methods (update branch, merge) (2 weeks)
+
+* Add Analytics (API + processes + SecAgg (incl. Metrics)) (1 month)
+
+
+* Refactor routines
+  - NOTES:
+    - Write some structure to handle Model + Optimizer(s) + Aggregator
+    - Write logic for setup (duplicate from Server to Client / Peer to Peer)
+    - Write logic for local training (provided data is available)
+    - Write logic for aggregation (then, modularize to support Gossip, etc.)
+
+* Revise serialization (move to Msgpack; revise custom code; screen performance) (1 week)
+
+
+---
+
+
+* Configuration tools => improve usability and extensibility; see with Rosalie's work
+
+* Interface and use FLamby (for examples and/or benchmarks)
+  -> See with Paul & Edwige
+
+* New algorithms
+  - Personalization via Hybrid training (2 weeks)
+  - Things with Rosalie?
+
+* Profile performance (benchmark: with asv or, easier, using current logging)
+
+* Revise Network Communication:
+  - Modularize timeout on responses => go minimal on that
+  - Enable connection loss/re-connection => not now, wait for tests / actual problems
+  - Improve the way clients are identified by MessagesHandler? => test again for issues (see if required/interesting)
+  - Improve how MessagesHandler handles multiple messages from or for the same client? => wait for roadmap on decentralized
+
+* Add client sampling
+
+* Split NetworkServer from FederatedServer
+
+* (Later) Quickrun mode revision
diff --git a/docs/user-guide/fl_process.md b/docs/user-guide/fl_process.md
index 6972092b14e41f2196f6978a66b0702759549a68..60ed43839dee1df8f8d394e35f2aa9f5dfa1ae16 100644
--- a/docs/user-guide/fl_process.md
+++ b/docs/user-guide/fl_process.md
@@ -15,13 +15,17 @@ exposed here.
   - the server may collect targetted metadata from clients when required
   - the server sets up the model, optimizers, aggregator and metrics
   - all clients receive instructions to set up these objects as well
+  - additional setup phases optionally occur to enable advanced features
+    (secure aggregation, differential privacy and/or group fairness)
 - Iteratively:
+  - (optionally) perform a fairness-related round
   - perform a training round
-  - perform an evaluation round
+  - (optionally) perform an evaluation round
   - decide whether to continue, based on the number of rounds taken or on
     the evolution of the global loss
 - Finally:
-  - restore the model weights that yielded the lowest global loss
+  - (optionally) evaluate the last model, if it was not already done
+  - restore the model weights that yielded the lowest global validation loss
   - notify clients that training is over, so they can disconnect
     and run their final routine (e.g. save the "best" model)
   - optionally checkpoint the "best" model
@@ -65,6 +69,8 @@ for dataset information (typically, features shape and/or dtype).
   - send specs to the clients so that they set up local counterpart objects
 - Client:
   - instantiate the model, optimizer, aggregator and metrics based on specs
+  - verify that the (optional) secure aggregation algorithm choice is
+    consistent with that of the server
 - messaging: (InitRequest <-> InitReply)
 
 #### (Optional) Local differential privacy setup
@@ -79,15 +85,84 @@ indicates to clients that it is to happen, as a secondary substep.
   - adjust the training process to use sample-wise gradient clipping and
     add gaussian noise to gradients, implementing the DP-SGD algorithm
   - set up a privacy accountant to monitor the use of the privacy budget
-- messaging: (PrivacyRequest <-> GenericMessage)
+- messaging: (PrivacyRequest <-> PrivacyReply)
+
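For readers unfamiliar with DP-SGD, the following minimal NumPy sketch illustrates the two operations named above (sample-wise gradient clipping, then Gaussian noising of the averaged gradient) in isolation. It is a didactic sketch only, not DecLearn's implementation (which plugs into the `Optimizer` and a privacy accountant), and every name in it is illustrative.

```python
import numpy as np


def dp_sgd_noisy_mean(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Illustrative DP-SGD step: clip per-sample gradients, then add noise.

    `per_sample_grads` has shape (n_samples, n_params). This is a didactic
    sketch of the mechanism, not DecLearn's actual routine.
    """
    # Clip each sample's gradient to an L2 norm of at most `clip_norm`.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * factors
    # Average the clipped gradients and add calibrated Gaussian noise.
    mean_grad = clipped.mean(axis=0)
    noise_std = noise_multiplier * clip_norm / len(per_sample_grads)
    noise = rng.normal(loc=0.0, scale=noise_std, size=mean_grad.shape)
    return mean_grad + noise


# Example usage with random data.
rng = np.random.default_rng(seed=0)
grads = rng.normal(size=(32, 10))  # 32 samples, 10 parameters
update = dp_sgd_noisy_mean(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```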
+#### (Optional) Fairness-aware federated learning setup
+
+This step is optional; a flag in the InitRequest at a previous step
+indicates to clients that it is to happen, as a secondary substep.
+
+See our [guide on Fairness](./fairness.md) for further details on
+what (group) fairness is and how it is implemented in DecLearn.
+
+When Secure Aggregation is to be used, it is also set up as a first step
+of this routine, ensuring exchanged values are protected when possible.
+
+- Server:
+  - send hyper-parameters to set up a controller for fairness-aware
+    federated learning
+- Client:
+  - set up a controller based on the server-emitted query
+  - send back sensitive group definitions
+- messaging: (FairnessSetupQuery <-> FairnessGroups)
+- Server:
+  - define a sorted list of sensitive group definitions across clients
+    and share it with clients
+  - await associated sample counts from clients and (secure-)aggregate them
+- Client:
+  - await group definitions and send back group-wise sample counts
+- messaging: (FairnessGroups <-> FairnessCounts)
+- Server & Client: run additional algorithm-specific setup steps, which
+  may have side effects on the training data, model, optimizer and/or
+  aggregator; further communication may occur.
+
+### (Optional) Secure Aggregation setup
+
+When configured to be used, Secure Aggregation may be set up any number of
+times during the process, as fresh controllers will be required each and
+every time the set of clients participating in a round differs from the
+one selected at the previous round.
+
+By default, however, all clients participate in each and every round, so
+that a single setup will occur early in the overall FL process.
+
+See our [guide on Secure Aggregation](./secagg.md) for further details on
+what secure aggregation is and how it is implemented in DecLearn.
+
+- Server:
+  - send an algorithm-specific SecaggSetupQuery message to selected clients
+  - trigger an algorithm-dependent setup routine
+- Client:
+  - parse the query and execute the associated setup routine
+- Server & Client: perform algorithm-dependent computations and communication;
+  finally, instantiate and assign respective encryption and decryption
+  controllers.
+- messaging: (SecaggSetupQuery <-> (algorithm-dependent Message))
+
+### (Optional) Fairness round
+
+This round only occurs when a fairness controller has been set up, and may
+be configured to be periodically skipped.
+If fairness is set up, the first fairness round will always occur.
+If checkpointing is set up on the server side, the last model will undergo
+a fairness round, to evaluate its fairness prior to ending the FL process.
+
+- Server:
+  - send a query to clients, including computational effort constraints,
+    and current shared model weights (when not already held by clients)
+- Client:
+  - compute metrics that account for the fairness of the current model
+- messaging: (FairnessQuery <-> FairnessReply)
+- Server & Client: take any additional algorithm-specific actions to alter
+  training based on the exchanged values; further communication may occur.
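As a reading aid, here is a schematic sketch of the server-side fairness setup exchange described above. Only the message names (FairnessSetupQuery, FairnessGroups, FairnessCounts) come from this guide; the `netwk.broadcast`/`netwk.collect` helpers and the controller methods are hypothetical placeholders rather than DecLearn's actual communication API.

```python
from typing import Any, Dict, List


def fairness_setup_sketch(controller: Any, clients: List[str], netwk: Any) -> None:
    """Schematic server-side fairness setup, mirroring the steps above.

    `controller` and `netwk` stand for a fairness controller and a network
    endpoint; all of their methods below are hypothetical placeholders.
    """
    # 1. Send hyper-parameters so that clients set up their own controller.
    netwk.broadcast("FairnessSetupQuery", controller.get_setup_params())
    # 2. Gather client-wise sensitive group definitions (FairnessGroups).
    groups_by_client: Dict[str, List[tuple]] = netwk.collect("FairnessGroups", clients)
    # 3. Define a sorted list of groups across clients and share it back.
    groups = sorted({grp for defs in groups_by_client.values() for grp in defs})
    netwk.broadcast("FairnessGroups", groups)
    # 4. Await group-wise sample counts and aggregate them (securely or not).
    counts_by_client = netwk.collect("FairnessCounts", clients)
    counts = [
        sum(counts[i] for counts in counts_by_client.values())
        for i in range(len(groups))
    ]
    # 5. Run any algorithm-specific additional setup steps.
    controller.finalize_setup(groups, counts)
```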
 
 ### Training round
 
 - Server:
   - select clients that are to participate
   - send data-batching and effort constraints parameters
-  - send shared model trainable weights and (opt. client-specific) optimizer
-    auxiliary variables
+  - send current shared model trainable weights (to clients that do not
+    already hold them) and optimizer auxiliary variables (if any)
 - Client:
   - update model weights and optimizer auxiliary variables
   - perform training steps based on effort constraints
@@ -101,7 +176,11 @@ indicates to clients that it is to happen, as a secondary substep.
   - run global updates through the server's optimizer to modify and finally
     apply them
 
-### Evaluation round
+### (Optional) Evaluation round
+
+This round may be configured to be periodically skipped.
+If checkpointing is set up on the server side, the last model will always be
+evaluated prior to ending the FL process.
 
 - Server:
   - select clients that are to participate
diff --git a/docs/user-guide/package.md b/docs/user-guide/package.md
index c976351fb3b83e07972f8113b526d09477e46525..5f7c690902f07649a2828d71da400436002d1fcb 100644
--- a/docs/user-guide/package.md
+++ b/docs/user-guide/package.md
@@ -12,6 +12,8 @@ The package is organized into the following submodules:
   Tools to write and extend shareable metadata fields specifications.
 - `dataset`:<br/>
   Data interfacing API and implementations.
+- `fairness`:<br/>
+  Processes and components for fairness-aware federated learning.
 - `main`:<br/>
   Main classes implementing a Federated Learning process.
 - `messaging`:<br/>
@@ -24,6 +26,8 @@ The package is organized into the following submodules:
   Framework-agnostic optimizer and algorithmic plug-ins API and tools.
 - `secagg`:<br/>
   Secure Aggregation API, methods and utils.
+- `training`:<br/>
+  Model training and evaluation orchestration tools.
 - `typing`:<br/>
   Type hinting utils, defined and exposed for code readability purposes.
 - `utils`:<br/>
@@ -270,6 +274,44 @@ You may learn more about our (non-abstract) `Optimizer` API by reading our
   and is about making the class JSON-serializable).
   - To avoid it, use `class MyClass(SecureAggregate, register=False)`.
 
+### Fairness
+
+#### `FairnessFunction`
+- Import: `declearn.fairness.api.FairnessFunction`
+- Object: Define a group-fairness criterion.
+- Usage: Compute fairness levels of a model based on group-wise accuracy.
+- Examples:
+  - `declearn.fairness.core.DemographicParityFunction`
+  - `declearn.fairness.core.EqualizedOddsFunction`
+- Extend:
+  - Simply inherit from `FairnessFunction` (registration is automated).
+  - To avoid it, use `class MyClass(FairnessFunction, register=False)`.
+
+#### `FairnessControllerServer`
+- Import: `declearn.fairness.api.FairnessControllerServer`
+- Object: Define server-side routines to monitor and enforce fairness.
+- Usage: modify the federated optimization algorithm; orchestrate fairness
+  rounds to measure the trained model's fairness level and adjust training
+  based on it.
+- Examples:
+  - `declearn.fairness.fairgrad.FairgradControllerServer`
+  - `declearn.fairness.fairbatch.FairbatchControllerServer`
+- Extend:
+  - Simply inherit from `FairnessControllerServer` (registration is automated).
+  - To avoid it, use `class MyClass(FairnessControllerServer, register=False)`.
+
+#### `FairnessControllerClient`
+- Import: `declearn.fairness.api.FairnessControllerClient`
+- Object: Define client-side routines to monitor and enforce fairness.
+- Usage: modify the federated optimization algorithm; measure a model's
+  local fairness level; adjust training based on server-emitted values.
+- Examples:
+  - `declearn.fairness.fairgrad.FairgradControllerClient`
+  - `declearn.fairness.fairbatch.FairbatchControllerClient`
+- Extend:
+  - Simply inherit from `FairnessControllerClient` (registration is automated).
+  - To avoid it, use `class MyClass(FairnessControllerClient, register=False)`.
+
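To make the extension recipe above concrete, here is a minimal sketch of a custom fairness criterion. The import path and the `register=False` switch are those documented above; the `f_type` attribute and the `compute_from_group_accuracy` method are hypothetical placeholders for whatever abstract members the actual `FairnessFunction` API defines, and should be checked against the API reference.

```python
from typing import Dict, Tuple

from declearn.fairness.api import FairnessFunction


class AccuracyGapFunction(FairnessFunction, register=False):
    """Toy criterion: per-group gap to the overall mean accuracy.

    The members below are hypothetical placeholders; check the actual
    abstract API of `FairnessFunction` before implementing a criterion.
    """

    f_type = "accuracy_gap"  # hypothetical identifier attribute

    def compute_from_group_accuracy(
        self, accuracy: Dict[Tuple, float]
    ) -> Dict[Tuple, float]:
        # Fairness level of each group: its accuracy minus the mean accuracy.
        mean = sum(accuracy.values()) / len(accuracy)
        return {group: acc - mean for group, acc in accuracy.items()}
```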
 ## Full API Reference
 
 The full API reference, which is generated automatically from the code's
diff --git a/docs/user-guide/usage.md b/docs/user-guide/usage.md
index 6b05d75a11bc12683a088c2c2f1cb09ea695ccf4..9392b9cdca362f0dbd03583e2400df65ea28a9b4 100644
--- a/docs/user-guide/usage.md
+++ b/docs/user-guide/usage.md
@@ -32,7 +32,9 @@ details on this example and on how to run it, please refer to its own
     used by clients to derive local step-wise updates from model gradients.
   - Similarly, parameterize an `Optimizer` to be used by the server to
     (optionally) refine the aggregated model updates before applying them.
-  - Wrap these three objects into a `declearn.main.config.FLOptimConfig`,
+  - Optionally, parameterize a `FairnessControllerServer`, defining an
+    algorithm to enforce fairness constraints on the model being trained.
+  - Wrap these objects into a `declearn.main.config.FLOptimConfig`,
     possibly using its `from_config` method to specify the former three
     components via configuration dicts rather than actual instances.
   - Alternatively, write up a TOML configuration file that specifies these
@@ -56,6 +58,9 @@ details on this example and on how to run it, please refer to its own
     defines metrics to be computed by clients on their validation data.
   - Optionally provide the path to a folder where to write output files
     (model checkpoints and global loss history).
+  - Optionally parameterize and provide a `SecaggConfigServer` or its
+    configuration, to set up and use secure aggregation for all quantities
+    that support it (model weights, metrics and metadata).
   - Instantiate a `declearn.main.config.FLRunConfig` to specify the process:
     - Maximum number of training and evaluation rounds to run.
     - Registration parameters: exact or min/max number of clients to have
@@ -63,11 +68,16 @@ details on this example and on how to run it, please refer to its own
     - Training parameters: data-batching parameters and effort constraints
       (number of local epochs and/or steps to take, and optional timeout).
     - Evaluation parameters: data-batching parameters and effort constraints
-      (optional maximum number of steps (<=1 epoch) and optional timeout).
+      (optional maximum number of steps (<=1 epoch) and optional timeout);
+      optional frequency (to evaluate only once every N training rounds).
     - Early-stopping parameters (optionally): patience, tolerance, etc. as
       to the global model loss's evolution throughout rounds.
     - Local Differential-Privacy parameters (optionally): (epsilon, delta)
       budget, type of accountant, clipping norm threshold, RNG parameters.
+    - Fairness evaluation parameters (optionally): computational constraints
+      and, optionally, the frequency of fairness rounds; only used if fairness
+      is set up in the `FLOptimConfig`, and filled in automatically if left
+      unspecified.
   - Alternatively, write up a TOML configuration file that specifies all of
     the former hyper-parameters.
   - Call the server's `run` method, passing it the former config object,
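Putting the server-side steps above together, here is a minimal sketch under stated assumptions: `FLOptimConfig.from_config`, `FLRunConfig` and the server's `run` method are named in this guide, and `FederatedServer` in the roadmap notes, but the exact keyword arguments and configuration field names used below (`fairness`, `secagg`, `evaluate`, `frequency`, ...) are assumptions to verify against the actual API and examples.

```python
from declearn.main import FederatedServer
from declearn.main.config import FLOptimConfig, FLRunConfig


def build_and_run_server(model, network, secagg=None):
    """Sketch of the server-side workflow outlined above.

    `model` and `network` are a declearn Model and network-server
    configuration built beforehand; keyword names are assumptions.
    """
    # Federated optimization setup, optionally including a fairness algorithm.
    optim = FLOptimConfig.from_config(
        aggregator="averaging",
        client_opt={"lrate": 0.01, "modules": ["momentum"]},
        server_opt=1.0,
        fairness={"algorithm": "fairgrad"},  # assumed field name and values
    )
    server = FederatedServer(
        model,
        network,
        optim,
        secagg=secagg,                # optional SecaggConfigServer (or specs)
        checkpoint="outputs/server",  # optional output folder
    )
    # Run configuration mirroring the bullet list above (names assumed).
    run_cfg = FLRunConfig.from_params(
        rounds=20,
        register={"min_clients": 2},
        training={"batch_size": 32, "n_epoch": 1},
        evaluate={"batch_size": 128, "frequency": 2},
        fairness={"batch_size": 128, "frequency": 2},
    )
    server.run(run_cfg)
```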
@@ -113,6 +123,9 @@ details on this example and on how to run it, please refer to its own
     concerns.
   - Optionally provide the path to a folder where to write output files
     (model checkpoints and local loss history).
+  - Optionally parameterize and provide a `SecaggConfigClient` or its
+    configuration, to set up and use secure aggregation for all quantities
+    that support it (model weights, metrics and metadata).
   - Call the client's `run` method and let the magic happen.
 
 ## Logging
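For completeness, a matching client-side sketch: `FederatedClient` and `InMemoryDataset` are part of DecLearn's public API, and the client's `run` method and `SecaggConfigClient` are named above, but the exact constructor arguments (argument order, `checkpoint`, `secagg`, `target`) are assumptions to check against the package's examples.

```python
from declearn.dataset import InMemoryDataset
from declearn.main import FederatedClient


def build_and_run_client(network, train_path, valid_path, secagg=None):
    """Sketch of the client-side workflow outlined above.

    `network` is a network-client configuration built beforehand;
    keyword names below are assumptions about the actual signature.
    """
    # Wrap local data files into declearn Dataset objects.
    train_data = InMemoryDataset(train_path, target="label")
    valid_data = InMemoryDataset(valid_path, target="label")
    client = FederatedClient(
        network,
        train_data,
        valid_data,
        checkpoint="outputs/client",  # optional output folder
        secagg=secagg,                # optional SecaggConfigClient (or specs)
    )
    # Register with the server, run setup and rounds, then finalize.
    client.run()
```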