Mentions légales du service

Skip to content
Snippets Groups Projects
Verified Commit 7715ab91 authored by ANDREY Paul's avatar ANDREY Paul
Browse files

Write up release notes for 'v2.4.0'.

parent e7fae4b5
No related branches found
No related tags found
1 merge request!63Finalize DecLearn v2.4.0.
- [v2.4.0](v2.4.0.md)
- [v2.3.2](v2.3.2.md)
- [v2.3.1](v2.3.1.md)
- [v2.3.0](v2.3.0.md)
......
# declearn v2.4.0
Released: XX/XX/XXXX
**Important notice**:<br/>
DecLearn 2.4 derogates to SemVer by revising some of the major DecLearn
component APIs.
This is mitigated in two ways:
- No changes relate to the main process setup code, meaning that end-users
that do not use custom components (aggregator, optimodule, metric, etc.)
should not see any difference, and their code will work as before (as an
illustration, our examples' code remains unchanged).
- Key methods that were deprecated in favor of new API ones are kept for
two more minor versions, and are still tested to work as before.
Any end-user encountering issues due to the released or planned evolutions of
DecLearn is invited to contact us via GitLab, GitHub or e-mail so that we can
provide with assistance and/or update our roadmap so that changes do not hinder
the usability of DecLearn 2.x for research and applications.
## New version policy (and future roadmap)
As noted above, v2.4 does not fully abide by SemVer rules. In the future, more
partially-breaking changes and API revisions may be introduced, incrementally
paving the way towards the next major release, while trying as much as possible
not to break end-user code.
To avoid unforeseen incompatibilities and cryptic bugs from arsing, from this
version onwards, the server and clients are expected and verified to use the
same `major.minor` version of DecLearn.
This policy may be updated in the future, e.g. to specify that clients may
have a newer minor version than the server (and most probably not the other
way around).
To avoid unhappy surprises, we are starting to maintain a public roadmap on
our GitLab. Although it may change, it should provide interested users (notably
those that are interested in developing custom components or processes on top
of DecLearn) with a way to anticipate changes, and voice any concerns or advice
they might have.
## Revise all aggregation APIs
### Revise the overal design for aggregation and introduce `Aggregate` API
This release introduces the `Aggregate` API, which is based on an abstract base
dataclass acting as a template for data structures that require sharing across
peers and aggregation.
The `declearn.utils.Aggregate` ABC acts as a shared ancestor providing with
a base API and shared backend code to define data structures that:
- are serializable to and deserializable from JSON, and may therefore be
preserved across network communications
- are aggregatable into an instance of the same structure
- use summation as the default aggregation rule for fields, which is
overridable by redefining the `default_aggregate` method
- can implement custom `aggregate_<field.name>` methods to override the
default summation rule
- implement a `prepare_for_secagg` method that
- enables defining which fields merely require sum-aggregation and need
encryption when using SecAgg, and which fields are to be preserved in
cleartext (and therefore go through the usual default or custom
aggregation methods)
- can be made to raise a `NotImplementedError` when SecAgg cannot be
achieved on a data structure
This new ABC currently has three main children:
- `AuxVar`: replaces plain dict for `Optimizer` auxiliary variables
- `MetricState`: replaces plain dict for `Metric` intermediate states
- `ModelUpdates`: replaces sharing of updates as `Vector` and `n_steps`
Each of this is defined jointly with another (pre-existing, revised) API for
components that (a) produce `Aggregate` data structures based on some input
data and/or computations; (b) produce some output results based on a received
`Aggregate` structure, meant to result from the aggregation of multiple peers'
produced data.
### Revise `Aggregator` API, introducing `ModelUpdates`
The `Aggregator` API was revised to make use of the new `ModelUpdates` data
structure (inheriting `Aggregate`).
- `Aggregator.prepare_for_sharing` pre-processes an input `Vector` containing
raw model updates and an integer indicating the number of local SGD steps
into a `ModelUpdates` structure.
- `Aggregator.finalize_updates` receives a `ModelUpdates` resulting from the
aggregation of peers' instances, and performs final computations to produce
a `Vector` of aggregated model updates.
- The legacy `Aggregator.aggregate` method is deprecated (but still works).
### Revise auxiliary variables for `Optimizer`, introducing `AuxVar`
The `OptiModule` API (and, consequently, `Optimizer`) was revised as to the
design and signature of auxiliary variables related methods, to make use of
the new `AuxVar` data structure (inheriting `Aggregate`).
- `OptiModule.collect_aux_var` now emits either `None` or an `AuxVar` instance
(the precise type of which is module-dependent), instead of a mere dict.
- `OptiModule.process_aux_var` now expects a proper-type `AuxVar` instance
that _already_ aggregates clients' data, externalizing the aggregation rules
to the `AuxVar` subtypes, while keeping the finalization logic part of the
`OptiModule` subclasses.
- `Optimizer.collect_aux_var` therefore emits a `{name: aux_var}` dict.
- `Optimizer.process_aux_var` therefore expects a `{name: aux_var}` dict,
rather than having distinct signatures on the client and server sides.
- It is now expected that server-side components will send the _same_ data
to all clients, rather than allow sending client-wise values.
The backend code of `ScaffoldClientModule` and `ScaffoldServerModule` was
heavily revised to alter the distribution of information and computations:
- Client-side modules are now the sole owners of their local state, and send
sum-aggregatable updates to the server, that are therefore SecAgg-compatible.
- The server consequently shares the same information with all clients, namely
the current global state.
- To keep track of the (possibly growing with time) number of unique client,
clients generate a random uuid that is sent with their state updates and
preserved in cleartext when SecAgg is used.
- As a consequence, the server component knows which clients contributed to a
given round, but receives an aggregate of local updates rather than the
client-wise state values.
### Revise `Metric` API, introducing `ModelState`
The `Metric` API was revised to make use of the new `MetricState` data
structure (inheriting `Aggregate`).
- `Metric.build_initial_states` generates a "zero-state" `MetricState` instance
(it replaces the previously-private `_build_states` method that returned a
dict).
- `Metric.get_states` returns a (Metric-type-dependent) `MetricState`
instance, instead of a mere dict.
- `Metric.set_states` assigns an incoming `MetricState` into the instance, that
may be finalized into results using the unchanged `get_result` method.
- The legacy `Metric.agg_states` is deprecated, in favor of `set_states` (but
it still works).
## Revise backend communications and messaging APIs
This release introduces some important backend changes to the communication
and messaging APIs of DecLearn, resulting in more robust code (that is also
easier to test and maintain), more efficient message parsing (possibly-costly
de-serialization is now delayed to a time posterior to validity verification)
and the extensibility of application messages, enabling to easily define and
use custom message structures in downstream applications.
The most important API change is that network communication endpoints now
return `SerializedMessage` instances rather than `Message` ones.
### New `declearn.communication.api.backend` submodule
- Introduce a new `ActionMessage` minimal API under its `actions`
submodule, that defines hard-coded, lightweight and easy-to-parse
data structures designed to convey information and content across
network communications agnostic to the content's nature.
- Revise and expose the `MessagesHandler` util, that now builds on
the `ActionMessage` API to model remote calls and answer them.
- Move the `declearn.communication.messaging.flags` submodule to
`declearn.communication.api.backend.flags`.
### New `declearn.messaging` submodule
- Revise the `Message` API to make it extendable, with automated
type-registration of subclasses by default.
- Introduce `SerializedMessage` as a wrapper for received messages,
that parses the exact `Lessage` subtype (enabling logic tests and
message filtering) but delays actual content de-serialization and
`Message` object recovery (enabling to prevent undue resources use
for unwanted messages that end up being discarded).
- Move most existing `Message` subclasses to the new submodule, for
retro-compatibility purposes. In DecLearn 3.0 these will probably
be re-dispatched to make it clear that concrete messages only make
sense in the context of specific multi-agent processes.
- Drop backend-oriented `Message` subclasses that are replaced with
the new `ActionMessage` backbone structures.
- Deprecate the `declearn.communication.messaging` submodule, that
is temporarily maintained, re-exporting moved contents as well as
deprecated message types (which are bound to be rejected if sent).
### Revise `NetworkClient` and `NetworkServer`
- Have message-receiving methods return `SerializedMessage` instances
rather than finalized de-serialized `Message` ones.
- Quit sending and expecting 'data_info' with registration requests.
- Rename `NetworkClient.check_message` into `recv_message` (keep
the former as an alias, with a `DeprecationWarning`).
- Improve the use of (optional) timeouts when sending or expecting
messages and overall exceptions handling:
- `NetworkClient.recv_message` may either raise a `TimeoutError`
(in case of timeout) or `RuntimeError` (in case of rejection).
- `NeworkServer.send_messages` and `broadcast_message` quietly
stop waiting for clients to collect messages after the (opt.)
timeout delay has passed. Messages may still be collected.
- `NetworkServer.wait_for_messages` no longer accepts a timeout.
- `NetworkServer.wait_for_messages_with_timeout` implements the
possibility to setup a timeout. It returns both received client
replies and a list of clients that failed to answer.
- All timeouts can now be specified as float values (which is
mostly useful for testing purposes or simulated environments).
- Add a `heartbeat` instantiation parameter, with a default value of
1 second, that is passed to the underlying `MessagesHandler`. In
simulated contexts (including tests), setting a low heartbeat can
cut runtime down significantly.
## Usability updates
A few minor changes were made in hope that they can improve DecLearn usability
for end-users.
### Record and save training losses
The `Model` API was updated so that `Model.compute_batch_gradients` now records
the computed batch-averaged model loss as a float value in an internal buffer,
and the newly-introduced `Model.collect_training_losses` method enables getting
all stored values (and purging the buffer on the way).
The `FederatedClient` was consequently updated to collect and export training
loss values at the end of each and every training round when a `Checkpointer`
is attached to it (otherwise, values are purged from memory but not recorded to
disk).
### Add a verbosity option for `FederatedClient` and `TrainingManager`.
`FederatedClient` and `TrainingManager` now both accept a `verbose: bool=True`
instantiatation keyword argument, that changes:
- (a) the default logger verbosity level: if `logger=None` is also passed,
the default logger will have 'info'-level verbosity if `verbose` and
`declearn.utils.LOGGING_LEVEL_MAJOR`-level if `not verbose`, so that only
evaluation metrics and errors are logged.
- (b) the optional display of a progressbar when conducting training or
evaluation rounds; if `not verbose`, no progressbar is used.
### Add 'TrainingManager.(train|evaluate)_under_constraints' to the API.
These public methods enable running training or evaluation rounds without
relying on `Message` structures to specify parameters nor collect results.
### Modularize 'run_as_processes' input specs.
The `declearn.utils.run_as_processes` util was modularized to that routines
can be specified in various ways. Previously, they could only be passed as
`(func, args)` tuples. Now, they can either be passed as `(func, args)`,
`(func, kwargs)` or `(func, args, kwargs)`, where `args` is still a tuple
of positional arguments, and `kwargs` is a dict of keyword ones.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment