Quickrun mode

BIGAUD Nathan requested to merge experimental into develop

Summary

  1. Build a quickrun mode, with a simple example to start using the package. Two use cases:
    • A way to discover the codebase in a guided way
    • A config system for repeating experiments
  2. Make the quickstart section of the doc the sole entry point for a user to discover the lib, and improve the doc to be easier to use.
  3. Introduce a data splitting utility

Done (detailed):

v0

  • Run
    • Create main run file using multiprocessing
    • Make multiprocessing a main util and revamp for mac compatibility
  • TOML
    • Write the TOML file for MNIST
    • Add options in from_toml method of Config classes
  • Create a model file and MyModel in a clean way
  • Data loading: generalize split_data
    • Update the data loading function to follow our naming
    • Remove csv export
    • Split the data between train and test
    • Generalize data loading function
    • Take out the to-csv export and the fedbiomed data format
    • Review the argparser
    • Check as_array accepts Sparse inputs > it does not
  • Add run from the package, using project.scripts in the main .toml
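
The data-splitting items above can be sketched as follows: shuffle the sample indices, deal them into shards (one per client), then hold out part of each shard for testing. This is a simplified sketch using only the standard library; the function names and signatures are hypothetical and do not reproduce the actual `split_data` utility.

```python
import random
from typing import List, Sequence, Tuple


def split_iid(n_samples: int, n_shards: int, seed: int = 0) -> List[List[int]]:
    """Shuffle sample indices and deal them into near-equal shards."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing gives shard sizes that differ by at most one sample.
    return [indices[i::n_shards] for i in range(n_shards)]


def train_test_split(
    indices: Sequence[int], test_fraction: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Hold out the last `test_fraction` of a shard as its test set."""
    n_test = int(round(len(indices) * test_fraction))
    cut = len(indices) - n_test
    return list(indices[:cut]), list(indices[cut:])


shards = split_iid(n_samples=10, n_shards=3)
for shard in shards:
    train, test = train_test_split(shard, test_fraction=0.25)
    print(len(train), len(test))
```

Working on indices rather than on the arrays themselves keeps the sketch independent of the data format, so the same logic would apply to dense or sparse inputs.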

Prep Magnet au Lac

  • Finish debugging:
    • pytorch version of the model (data input shape issue)
  • Refine TOML and argparsers:
    • Provide a TOML instead of a full folder as input to quickrun
    • Add custom names for files in TOML (see if the name can be picked up from the file name)
    • Add parsing of Client and Server kwargs in TOML
      • Checkpointer (incl. single experiment destination in TOML)
      • Metric
    • Generate client network from server TOML and deal with client name more elegantly
    • Remove argparser in _split_data
  • Dataset
    • Add shuffling in label split function
  • Test all options in config toml
    • If a data split already exists and a new one is requested, check whether the requested split is already there to decide if splitting is needed
    • Parse metrics
  • Simplify MNIST default model
  • Allow for empty fields in TOML
  • Improve doc (including toml comments)
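
As an illustration of the "shuffling in label split function" item, the sketch below groups sample indices by label, assigns each label to a single shard, and shuffles at two levels: which labels end up together, and the sample order inside each shard. The function name and signature are hypothetical; this is one possible way to do it, not the package's implementation.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def split_by_labels(
    labels: Sequence[int], n_shards: int, seed: int = 0
) -> List[List[int]]:
    """Assign all samples of a given label to a single shard."""
    rng = random.Random(seed)
    by_label: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    unique = sorted(by_label)
    rng.shuffle(unique)  # randomize which labels are grouped together
    shards: List[List[int]] = [[] for _ in range(n_shards)]
    for position, label in enumerate(unique):
        shards[position % n_shards].extend(by_label[label])
    for shard in shards:
        rng.shuffle(shard)  # avoid label-sorted sample order within a shard
    return shards


labels = [0, 1, 2, 3, 0, 1, 2, 3]
shards = split_by_labels(labels, n_shards=2)
for shard in shards:
    print(sorted({labels[i] for i in shard}))
```

Without the final shuffle, each shard would list its samples label by label, which can bias any training loop that reads batches in order.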

Integrate feedback from Magnet au Lac:

  • Separate the data split functionality
  • Properly document quickrun
    • Split quickrun and example doc
    • Rewrite quickstart section
    • Write a 'how to use TOML' entry point (start in quickstart, move it if needed)
    • Do collab
    • Modify doc for full follow up (Should I use the original objects or the config?) > No need
  • Clarify absolute vs relative path
  • Lint
  • Rebase -i
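
On the absolute vs relative path question, one possible convention (an assumption here, the description does not state which one was adopted) is to keep absolute paths untouched and to interpret relative paths with respect to the folder holding the TOML file. The helper name below is hypothetical, and the example paths assume a POSIX system.

```python
from pathlib import Path


def resolve_from_config(raw_path: str, config_file: str) -> Path:
    """Keep absolute paths; anchor relative ones at the config file's folder."""
    path = Path(raw_path)
    if path.is_absolute():
        return path
    return Path(config_file).parent / path


# Illustrative paths only (POSIX-style).
print(resolve_from_config("data/mnist", "/home/user/expe/config.toml"))
print(resolve_from_config("/tmp/mnist", "/home/user/expe/config.toml"))
```

Anchoring at the config file rather than at the current working directory makes a run independent of where the command is launched from.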

To do

P1

  • Drop the full list of client names and infer it?
  • Update heart UCI to MNIST
  • Add global seed > opening a separate branch and MR
  • Update the collab with a more scalable solution that drops the reference to the experimental branch

P2

  • Checkpointer saves a lot of things by default - could trim it for a quickrun
  • Use "dataclass_from_func" in building my config for clarity of doc
  • Literal["iid", "labels", "biased"] not parsed in DataSplitConfig
  • Add torch and sklearn examples
  • Reintroduce DataSplitConfig (find a way to use the doc of the parent class)
  • Remaining 'bug': Slow pytorch runtime
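
For the `Literal` item above, a possible approach is to validate `Literal`-annotated dataclass fields after parsing, using `typing.get_origin` and `typing.get_args`. The `SplitOptions` class is a hypothetical stand-in for `DataSplitConfig`, and `check_literal_fields` is an illustrative helper, not an existing function of the package.

```python
import dataclasses
from typing import Any, Literal, get_args, get_origin, get_type_hints


@dataclasses.dataclass
class SplitOptions:
    """Hypothetical stand-in for a data-split config."""

    n_shards: int = 2
    scheme: Literal["iid", "labels", "biased"] = "iid"


def check_literal_fields(config: Any) -> None:
    """Raise ValueError if a Literal-annotated field holds an unlisted value."""
    hints = get_type_hints(type(config))
    for field in dataclasses.fields(config):
        hint = hints[field.name]
        if get_origin(hint) is Literal:
            allowed = get_args(hint)
            value = getattr(config, field.name)
            if value not in allowed:
                raise ValueError(
                    f"'{field.name}' must be one of {allowed}, got {value!r}"
                )


check_literal_fields(SplitOptions(scheme="labels"))  # accepted silently
try:
    check_literal_fields(SplitOptions(scheme="random"))
except ValueError as exc:
    print(exc)
```
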
Edited by ANDREY Paul