Verified Commit 31405424 authored by ANDREY Paul

Update user documentation to cover recent API extensions.

- Add SecAgg, Fairness and evaluation-frequency options to the usage
  guide.
- Add SecAgg and Fairness related phases to the FL process overview.
- Add Fairness API to the package overview.
parent 431cec6d
1 merge request: !70 Finalize version 2.6.0
* Integrate Fairness-aware methods (update branch, merge) (2 weeks)
* Add Analytics (API + processes + SecAgg (incl. Metrics)) (1 Month)
* Refactor routines
- NOTES:
- Write some structure to handle Model + Optimizer(s) + Aggregator
- Write logic for setup (duplicate from Server to Client / Peer to Peer)
- Write logic for local training (provided data is available)
- Write logic for aggregation (then, modularize to support Gossip, etc.)
* Revise serialization (move to msgpack; revise custom code; screen performance) (1 week)
---
* Configuration tools => improve usability and extensibility; coordinate with Rosalie's work
* Interface and use FLamby (for examples and/or benchmarks)
-> Coordinate with Paul & Edwige
* New algorithms
- Personalization via Hybrid training (2 weeks)
- Things with Rosalie?
* Profile performance (benchmark with asv or, more simply, using current logging)
* Revise Network Communication:
- Modularize timeout on responses => go minimal on that
- Enable connection loss/re-connection => not now, wait for tests / actual problems
- Improve the way clients are identified by MessagesHandler? => test again for issues (see if required/interesting)
- Improve the handling of MessagesHandler receiving multiple messages from or for the same client? => wait for roadmap on decentralized
* Add client sampling
* Split NetworkServer from FederatedServer
* (Later) Quickrun mode revision
@@ -15,13 +15,17 @@ exposed here.
- the server may collect targeted metadata from clients when required
- the server sets up the model, optimizers, aggregator and metrics
- all clients receive instructions to set up these objects as well
- additional setup phases optionally occur to set up advanced features
(secure aggregation, differential privacy and/or group fairness)
- Iteratively:
- (optionally) perform a fairness-related round
- perform a training round
- (optionally) perform an evaluation round
- decide whether to continue, based on the number of
rounds taken or on the evolution of the global loss
- Finally:
- (optionally) evaluate the last model, if it was not already done
- restore the model weights that yielded the lowest global validation loss
- notify clients that training is over, so they can disconnect
and run their final routine (e.g. save the "best" model)
- optionally checkpoint the "best" model
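
The continuation decision and final weight restoration can be pictured with a tiny standalone sketch (plain Python, not DecLearn code; the loss values are made up):

```python
# Toy sketch (not DecLearn code) of the stopping and restoration logic above:
# stop once the global validation loss has not improved for `patience` rounds,
# then restore the weights recorded at the best round.
losses = [0.90, 0.71, 0.64, 0.66, 0.65, 0.67]  # made-up per-round global losses
patience, best_round, best_loss = 2, 0, float("inf")
for rnd, loss in enumerate(losses):
    if loss < best_loss:
        best_round, best_loss = rnd, loss
    elif rnd - best_round >= patience:
        break  # no improvement for `patience` rounds: stop iterating
print(f"stopped at round {rnd}; restoring weights from round {best_round}")
```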
@@ -65,6 +69,8 @@ for dataset information (typically, features shape and/or dtype).
- send specs to the clients so that they set up local counterpart objects
- Client:
- instantiate the model, optimizer, aggregator and metrics based on specs
- verify that the (optional) secure aggregation algorithm choice is consistent
with that of the server
- messaging: (InitRequest <-> InitReply)
#### (Optional) Local differential privacy setup
@@ -79,15 +85,84 @@ indicates to clients that it is to happen, as a secondary substep.
- adjust the training process to use sample-wise gradient clipping and
add Gaussian noise to gradients, implementing the DP-SGD algorithm
- set up a privacy accountant to monitor the use of the privacy budget
- messaging: (PrivacyRequest <-> PrivacyReply)
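
To illustrate the clipping-and-noising step, here is a generic DP-SGD sketch in NumPy; this is not DecLearn's actual implementation, and the clipping norm and noise multiplier values are arbitrary:

```python
# Generic DP-SGD step (illustration only, not DecLearn's implementation):
# clip each sample's gradient to a maximum L2 norm, average the clipped
# gradients, then add calibrated Gaussian noise to the average.
import numpy as np

rng = np.random.default_rng(seed=0)
per_sample_grads = rng.normal(size=(8, 4))  # 8 samples, 4 parameters (dummy)
clip_norm, noise_multiplier = 1.0, 1.1      # arbitrary illustrative values

norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
clipped = per_sample_grads * np.minimum(1.0, clip_norm / norms)
noise = rng.normal(scale=noise_multiplier * clip_norm, size=clipped.shape[1])
private_grad = (clipped.sum(axis=0) + noise) / len(clipped)
print(private_grad)  # noisy average gradient, usable for a model update
```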
#### (Optional) Fairness-aware federated learning setup
This step is optional; a flag in the InitRequest at a previous step
indicates to clients that it is to happen, as a secondary substep.
See our [guide on Fairness](./fairness.md) for further details on
what (group) fairness is and how it is implemented in DecLearn.
When Secure Aggregation is to be used, it is also set up as a first step
of this routine, ensuring exchanged values are protected when possible.
- Server:
- send hyper-parameters to set up a controller for fairness-aware
federated learning
- Client:
- set up a controller based on the server-emitted query
- send back sensitive group definitions
- messaging: (FairnessSetupQuery <-> FairnessGroups)
- Server:
- define a sorted list of sensitive group definitions across clients
and share it with clients
- await associated sample counts from clients and (secure-)aggregate them
- Client:
- await group definitions and send back group-wise sample counts
- messaging: (FairnessGroups <-> FairnessCounts)
- Server & Client: run algorithm-specific additional setup steps, which
may have side effects on the training data, model, optimizer and/or
aggregator; further communication may occur.
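
For intuition, the group definitions and counts exchanged above may look as follows; this is a toy, standard-library illustration, and the actual DecLearn message formats may differ:

```python
# Toy illustration (not the actual message formats): a client defines
# sensitive groups by crossing a sensitive attribute with the target label,
# counts its local samples per group, and shares definitions and counts.
from collections import Counter

local_samples = [  # (sensitive_attribute, label) pairs from a client's data
    ("female", 0), ("female", 1), ("male", 0), ("female", 1), ("male", 1),
]
counts = Counter(local_samples)
group_defs = sorted(counts)  # sent in a FairnessGroups-like message
group_counts = [counts[group] for group in group_defs]  # FairnessCounts-like
print(group_defs)    # [('female', 0), ('female', 1), ('male', 0), ('male', 1)]
print(group_counts)  # [1, 2, 1, 1]
```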
### (Optional) Secure Aggregation setup
When configured to be used, Secure Aggregation may be set up any number of
times during the process, as fresh controllers will be required each and
every time the set of clients participating in a round differs from that
of the previous round.
By default however, all clients participate in each and every round, so
that a single setup will occur early in the overall FL process.
See our [guide on Secure Aggregation](./secagg.md) for further details on
what secure aggregation is and how it is implemented in DecLearn.
- Server:
- send an algorithm-specific SecaggSetupQuery message to selected clients
- trigger an algorithm-dependent setup routine
- Client:
- parse the query and execute the associated setup routine
- Server & Client: perform algorithm-dependent computations and communication;
eventually, instantiate and assign respective encryption and decryption
controllers.
- messaging: (SecaggSetupQuery <-> (algorithm-dependent Message))
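
To give a sense of what secure aggregation achieves, here is a toy pairwise-masking sketch; it is unrelated to DecLearn's actual SecAgg algorithms and offers no real cryptographic guarantees:

```python
# Toy pairwise-masking sketch (not DecLearn's SecAgg algorithms): each pair
# of clients shares a random mask that one adds and the other subtracts, so
# the masks cancel in the sum and the server only learns the aggregate.
import random

values = {"alice": 3.0, "bob": 5.0, "carol": 2.0}  # private client values
clients = sorted(values)
masked = dict(values)
for i, first in enumerate(clients):
    for second in clients[i + 1:]:
        mask = random.uniform(-100, 100)  # secret shared by the two clients
        masked[first] += mask
        masked[second] -= mask
print(masked)                # what the server receives: individually meaningless
print(sum(masked.values()))  # ~10.0, the true sum of the private values
```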
### (Optional) Fairness round
This round only occurs when a fairness controller was set up, and may be
configured to be periodically skipped.
If fairness is set up, the first fairness round will always occur.
If checkpointing is set up on the server side, the last model will undergo
a fairness round, to evaluate its fairness prior to ending the FL process.
- Server:
- send a query to clients, including computational effort constraints,
and current shared model weights (when not already held by clients)
- Client:
- compute metrics that account for the fairness of the current model
- messaging: (FairnessQuery <-> FairnessReply)
- Server & Client: take any algorithm-specific additional actions to alter
training based on the exchanged values; further communication may occur.
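
As a rough picture of such adjustments, here is a toy reweighting heuristic; it is not the actual Fed-FairGrad or Fed-FairBatch algorithm, merely an illustration of turning fairness metrics into training weights:

```python
# Toy heuristic (not an actual DecLearn algorithm): derive group-wise loss
# weights from the exchanged fairness metrics, upweighting groups on which
# the current model under-performs so later rounds pay more attention to them.
group_accuracy = {"group_a": 0.92, "group_b": 0.78}  # made-up fairness metrics
mean_acc = sum(group_accuracy.values()) / len(group_accuracy)
weights = {grp: 1.0 + (mean_acc - acc) for grp, acc in group_accuracy.items()}
total = sum(weights.values())
weights = {grp: w / total for grp, w in weights.items()}  # normalized weights
print(weights)  # group_b receives a larger weight than group_a
```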
### Training round
- Server:
- select clients that are to participate
- send data-batching and effort constraints parameters
- send current shared model trainable weights (to clients that do not
already hold them) and optimizer auxiliary variables (if any)
- Client:
- update model weights and optimizer auxiliary variables
- perform training steps based on effort constraints
@@ -101,7 +176,11 @@ indicates to clients that it is to happen, as a secondary substep.
- run global updates through the server's optimizer to modify and finally
apply them
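
The aggregate-then-refine pattern above can be sketched as follows; this is a generic FedAvg-style toy in NumPy, not DecLearn's `Aggregator` or server-side `Optimizer` code:

```python
# Generic FedAvg-style toy (not DecLearn's Aggregator/Optimizer code): average
# client updates weighted by their sample counts, then let the server-side
# "optimizer" (reduced here to a plain learning rate) refine and apply them.
import numpy as np

client_updates = {"a": np.array([0.2, -0.1]), "b": np.array([0.4, 0.0])}
sample_counts = {"a": 100, "b": 300}
total = sum(sample_counts.values())
aggregated = sum(
    client_updates[name] * (sample_counts[name] / total) for name in client_updates
)
server_lrate = 1.0                    # server-side refinement, kept trivial here
global_weights = np.array([1.0, 1.0])
global_weights = global_weights + server_lrate * aggregated
print(global_weights)                 # [1.35, 0.975]
```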
### (Optional) Evaluation round
This round may be configured to be periodically skipped.
If checkpointing is set up on the server side, the last model will always be
evaluated prior to ending the FL process.
- Server:
- select clients that are to participate
...
@@ -12,6 +12,8 @@ The package is organized into the following submodules:
&emsp; Tools to write and extend shareable metadata fields specifications.
- `dataset`:<br/>
&emsp; Data interfacing API and implementations.
- `fairness`:<br/>
&emsp; Processes and components for fairness-aware federated learning.
- `main`:<br/>
&emsp; Main classes implementing a Federated Learning process.
- `messaging`:<br/>
@@ -24,6 +26,8 @@ The package is organized into the following submodules:
&emsp; Framework-agnostic optimizer and algorithmic plug-ins API and tools.
- `secagg`:<br/>
&emsp; Secure Aggregation API, methods and utils.
- `training`:<br/>
&emsp; Model training and evaluation orchestration tools.
- `typing`:<br/>
&emsp; Type hinting utils, defined and exposed for code readability purposes.
- `utils`:<br/>
@@ -270,6 +274,44 @@ You may learn more about our (non-abstract) `Optimizer` API by reading our
and is about making the class JSON-serializable).
- To avoid it, use `class MyClass(SecureAggregate, register=False)`.
### Fairness
#### `FairnessFunction`
- Import: `declearn.fairness.api.FairnessFunction`
- Object: Define a group-fairness criterion.
- Usage: Compute fairness levels of a model based on group-wise accuracy.
- Examples:
- `declearn.fairness.core.DemographicParityFunction`
- `declearn.fairness.core.EqualizedOddsFunction`
- Extend:
- Simply inherit from `FairnessFunction` (registration is automated).
- To avoid it, use `class MyClass(FairnessFunction, register=False)`.
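
For intuition only, here is the textbook demographic parity gap computed from group-wise positive-prediction rates; this plain-Python snippet is not the `DemographicParityFunction` implementation, which operates on group-wise accuracies:

```python
# Textbook demographic parity gap (illustration only, not the DecLearn class):
# the spread of positive-prediction rates across sensitive groups; a gap of
# zero means the model predicts the positive class at the same rate for all.
predictions = {  # sensitive attribute value -> binary predictions (made up)
    "group_a": [1, 0, 1, 1, 0, 1],
    "group_b": [0, 0, 1, 0, 0, 1],
}
rates = {grp: sum(preds) / len(preds) for grp, preds in predictions.items()}
dp_gap = max(rates.values()) - min(rates.values())
print(rates)   # {'group_a': 0.666..., 'group_b': 0.333...}
print(dp_gap)  # 0.333..., i.e. a substantial demographic parity violation
```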
#### `FairnessControllerServer`
- Import: `declearn.fairness.api.FairnessControllerServer`
- Object: Define server-side routines to monitor and enforce fairness.
- Usage: modify the federated optimization algorithm; orchestrate fairness
rounds to measure the trained model's fairness level and adjust training
based on it.
- Examples:
- `declearn.fairness.fairgrad.FairgradControllerServer`
- `declearn.fairness.fairbatch.FairbatchControllerServer`
- Extend:
- Simply inherit from `FairnessControllerServer` (registration is automated).
- To avoid it, use `class MyClass(FairnessControllerServer, register=False)`.
#### `FairnessControllerClient`
- Import: `declearn.fairness.api.FairnessControllerClient`
- Object: Define client-side routines to monitor and enforce fairness.
- Usage: modify the federated optimization algorithm; measure a model's
local fairness level; adjust training based on server-emitted values.
- Examples:
- `declearn.fairness.fairgrad.FairgradControllerClient`
- `declearn.fairness.fairbatch.FairbatchControllerClient`
- Extend:
- Simply inherit from `FairnessControllerClient` (registration is automated).
- To avoid it, use `class MyClass(FairnessControllerClient, register=False)`.
## Full API Reference
The full API reference, which is generated automatically from the code's
...
@@ -32,7 +32,9 @@ details on this example and on how to run it, please refer to its own
used by clients to derive local step-wise updates from model gradients.
- Similarly, parameterize an `Optimizer` to be used by the server to
(optionally) refine the aggregated model updates before applying them.
- Wrap these three objects into a `declearn.main.config.FLOptimConfig`, - Optionally, parametrize a `FairnessControllerServer`, defining an
algorithm to enforce fairness constraints to the model being trained.
- Wrap these objects into a `declearn.main.config.FLOptimConfig`,
possibly using its `from_config` method to specify the former three possibly using its `from_config` method to specify the former three
components via configuration dicts rather than actual instances. components via configuration dicts rather than actual instances.
- Alternatively, write up a TOML configuration file that specifies these - Alternatively, write up a TOML configuration file that specifies these
@@ -56,6 +58,9 @@ details on this example and on how to run it, please refer to its own
defines metrics to be computed by clients on their validation data.
- Optionally provide the path to a folder where to write output files
(model checkpoints and global loss history).
- Optionally parameterize and provide a `SecaggConfigServer` (or its
configuration) to set up and use secure aggregation for all quantities
that support it (model weights, metrics and metadata).
- Instantiate a `declearn.main.config.FLRunConfig` to specify the process:
- Maximum number of training and evaluation rounds to run.
- Registration parameters: exact or min/max number of clients to have
@@ -63,11 +68,16 @@ details on this example and on how to run it, please refer to its own
- Training parameters: data-batching parameters and effort constraints
(number of local epochs and/or steps to take, and optional timeout).
- Evaluation parameters: data-batching parameters and effort constraints
(optional maximum number of steps (<=1 epoch) and optional timeout);
optional frequency (to evaluate only once every N training rounds).
- Early-stopping parameters (optionally): patience, tolerance, etc. as
to the global model loss's evolution throughout rounds.
- Local Differential-Privacy parameters (optionally): (epsilon, delta)
budget, type of accountant, clipping norm threshold, RNG parameters.
- Fairness evaluation parameters (optionally): computational constraints
and, optionally, the frequency of fairness rounds; only used if fairness
is set up in the `FLOptimConfig`, and filled automatically if left
untouched.
- Alternatively, write up a TOML configuration file that specifies all of
the former hyper-parameters.
- Call the server's `run` method, passing it the former config object,
@@ -113,6 +123,9 @@ details on this example and on how to run it, please refer to its own
concerns.
- Optionally provide the path to a folder where to write output files
(model checkpoints and local loss history).
- Optionally parameterize and provide a `SecaggConfigClient` (or its
configuration) to set up and use secure aggregation for all quantities
that support it (model weights, metrics and metadata).
- Call the client's `run` method and let the magic happen.
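
Putting the server- and client-side steps above together, a minimal setup might look as follows. This is a hedged sketch: beyond `FLOptimConfig`, `FLRunConfig`, `SecaggConfigServer`/`SecaggConfigClient` and the `run` methods named in this guide, the class names, constructor arguments and values below (network configuration, dataset wrapper, `FederatedServer`/`FederatedClient` signatures, `from_params` helpers, the `frequency` key) are assumptions about the DecLearn API that may differ across versions; refer to the examples and the API reference for authoritative usage.

```python
# Hedged sketch of a minimal server + client setup; most names and argument
# layouts below are assumptions and may not match the actual DecLearn API.
# The two halves are meant to run in separate processes (one per participant).
from declearn.communication import NetworkClientConfig, NetworkServerConfig
from declearn.dataset import InMemoryDataset
from declearn.main import FederatedClient, FederatedServer
from declearn.main.config import FLOptimConfig, FLRunConfig
from declearn.model.sklearn import SklearnSGDModel

# --- server process ---
model = SklearnSGDModel.from_parameters(kind="classifier")
netwk = NetworkServerConfig(protocol="websockets", host="localhost", port=8765)
optim = FLOptimConfig.from_params(          # or `from_config` / a TOML file
    aggregator="averaging",                 # update-aggregation rule
    client_opt={"lrate": 0.01},             # client-side optimizer
    server_opt={"lrate": 1.0},              # server-side optimizer
    # fairness=...,                         # optional fairness controller (or config)
)
server = FederatedServer(
    model, netwk, optim,
    # metrics=...,                          # optional evaluation metrics specs
    # secagg=...,                           # optional SecaggConfigServer (or config)
    checkpoint="outputs/server",
)
run_cfg = FLRunConfig.from_params(
    rounds=10,
    register={"min_clients": 2},
    training={"batch_size": 32, "n_epoch": 1},
    evaluate={"batch_size": 128, "frequency": 2},  # assumed `frequency` key
)
server.run(run_cfg)

# --- client process (one per participating client) ---
netwk = NetworkClientConfig(
    protocol="websockets", server_uri="ws://localhost:8765", name="client-1"
)
train = InMemoryDataset("data/client-1-train.csv", target="label")
valid = InMemoryDataset("data/client-1-valid.csv", target="label")
client = FederatedClient(
    netwk, train, valid,
    # secagg=...,                           # optional SecaggConfigClient (or config)
    checkpoint="outputs/client-1",
)
client.run()
```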
## Logging
...