diff --git a/README.md b/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..dc65218c5b0fca1136a71ee5bbb81ad237009a15
--- /dev/null
+++ b/README.md
@@ -0,0 +1,227 @@
+# H-REVOLVE: Program reversals with multiple checkpoint levels
+
+Authors: Julien Herrmann, Guillaume Aupy
+Licence: See file LICENSE
+
+## Abstract:
+
+This is the library described in the paper "H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms" by Herrmann and Pallez [0].
+
+This code implements checkpointing strategies for adjoint computation in the presence of multiple levels of storage:
+ - a fast storage of limited size (e.g. memory),
+ - a slower storage of larger size (e.g. disks),
+ - etc., where the larger the storage, the slower the access.
+
+Details of the assumptions made (write/read costs, etc.) and of what the code does or will do are available in:
+ - "H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms" by Herrmann and Pallez [0]
+ - "Optimal Multi-stage algorithm for Adjoint Computation" by Aupy, Herrmann, Hovland and Robert [1]
+ - "Periodicity in optimal hierarchical checkpointing schemes for adjoint computations" by Aupy and Herrmann [2].
+These works are available in the folder "references".
+
+## Installation
+
+```
+pip install hrevolve
+```
+
+## Inputs
+
+The algorithms take the following inputs:
+
+ | Input | Meaning |
+ |--:|--|
+ | [l] | the length of the graph to be differentiated |
+ | [cm] | the number of forward time steps that can be stored in memory |
+ | [uf] | the time to perform a forward time step (Default: 1) |
+ | [ub] | the time to perform a backward time step (Default: 1) |
+ | [wd] | the time to store a forward time step to disk (Default: 5) |
+ | [rd] | the time to retrieve a forward time step from disk (Default: 5) |
+
+Not all algorithms take all inputs.
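
As a rough illustration of how these inputs fit together, the sketch below builds a command-line parser with the two positional arguments and the option flags documented in this README, using the defaults from the table. It is a hypothetical stand-in written with Python's standard `argparse` module, not the parser shipped with the package; the name `build_parser` is an assumption made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the inputs table (not the package's own code)."""
    parser = argparse.ArgumentParser(description="Illustrative checkpointing inputs")
    # Positional inputs: graph length and number of memory slots.
    parser.add_argument("l", type=int, help="length of the graph to be differentiated")
    parser.add_argument("cm", type=int, help="forward time steps that fit in memory")
    # Optional costs, with the defaults listed in the table.
    parser.add_argument("--uf", type=float, default=1.0, help="time of a forward step")
    parser.add_argument("--ub", type=float, default=1.0, help="time of a backward step")
    parser.add_argument("--wd", type=float, default=5.0, help="time to write a step to disk")
    parser.add_argument("--rd", type=float, default=5.0, help="time to read a step from disk")
    return parser


# Same arguments as the documented example `Revolve 100 8 --uf 1.2 --ub 3`.
args = build_parser().parse_args(["100", "8", "--uf", "1.2", "--ub", "3"])
print(args.l, args.cm, args.uf, args.ub, args.wd, args.rd)
```

Options that are not given on the command line fall back to the table defaults, which is why `wd` and `rd` are still populated in the call above.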
+
+## Algorithms
+
+### Single Level algorithms
+[*Revolve*] Revolve is the implementation of the single-level dynamic programming presented in [3], where the costs of read and write are considered equal to 0. It provides the user with a schedule that minimizes the execution time to reverse the graph.
+Usage:
+```console
+$ Revolve [l] [cm] --uf [uf] --ub [ub]
+```
+Examples:
+```console
+$ Revolve 100 8 --uf 1.2 --ub 3
+$ Revolve 20 2
+```
+
+### Two Level algorithms
+The implementations in this section consider a specific model with two levels of storage:
+ - a fast storage (read and write costs = 0) of limited size (e.g. memory),
+ - a slow storage of unbounded size (e.g. disks).
+
+[*Disk-Revolve*] Disk-Revolve is the implementation of the two-level dynamic programming presented in [1]. It provides the user with a schedule that minimizes the execution time to reverse the graph.
+
+Usage:
+```console
+$ Disk-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd]
+```
+Examples:
+```console
+$ Disk-Revolve 100 8 --uf 1.2 --ub 3 --wd 8 --rd 2.5
+$ Disk-Revolve 20 2
+```
+
+[*Periodic-Disk-Revolve*] Periodic-Disk-Revolve returns a schedule where the number of forward steps between two consecutive disk checkpoints is constant. The schedule provided is shown to be asymptotically optimal in [2]. In addition, a conjecture was made regarding the value of the optimal period that Periodic-Disk-Revolve should use. The `--fast` option computes this conjectured value with a much lower complexity than the default [Periodic-Disk-Revolve] computation (constant time instead of O(l^3)).
+Usage:
+```console
+$ Periodic-Disk-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd] --fast
+```
+Examples:
+```console
+$ Periodic-Disk-Revolve 100 3 --uf 1.2 --ub 3 --wd 8 --rd 2.5
+$ Periodic-Disk-Revolve 200 8
+```
+
+### Multilevel algorithms
+
+[*H-revolve*] H-revolve is the algorithm described in Herrmann and Pallez [0].
+ This returns a schedule for an architecture with several levels of storage, where the higher-numbered levels are typically larger but more expensive to access.
+Usage:
+```console
+$ H-Revolve [l] [file_name] --uf [uf] --ub [ub]
+```
+where `file_name` is the name of a file encoding the architecture. This file contains a line consisting of a single integer which is the number of storage levels. It then contains a line for each storage level. Each storage level line comprises three numbers separated by whitespace. These are the number of storage slots at that level, the cost of writing one timestep to that level, and the cost of reading one timestep from that level. Comment lines starting with `#` are also permitted. An example file would have the contents:
+```
+3
+1 0 0
+3 2 2
+10 5 5
+```
+
+
+* TODO:
+ - [*Online-Disks*] In [2], it was shown that one can compute an asymptotically optimal online solution (the size of the adjoint graph is not known beforehand) using the optimal period provided by [Periodic-Disk-Revolve].
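
The architecture file format described above for H-Revolve can be read with a few lines of Python. The sketch below only illustrates the format and is not the parser used by the package: `parse_architecture` is a hypothetical helper, and returning three lists named after the `cvect`, `wvect` and `rvect` arguments of the library interface is an assumption made for convenience.

```python
from typing import List, Tuple


def parse_architecture(text: str) -> Tuple[List[int], List[float], List[float]]:
    """Parse architecture-file contents into (cvect, wvect, rvect).

    Hypothetical helper: it follows the format described in this README only.
    """
    # Drop blank lines and comment lines starting with '#'.
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith("#")]
    if not rows or len(rows[0]) != 1:
        raise ValueError("first data line must be the number of storage levels")
    n_levels = int(rows[0][0])
    levels = rows[1:]
    if len(levels) != n_levels:
        raise ValueError(f"expected {n_levels} level lines, found {len(levels)}")
    cvect, wvect, rvect = [], [], []
    for fields in levels:
        if len(fields) != 3:
            raise ValueError("each level line needs: slots write_cost read_cost")
        cvect.append(int(fields[0]))    # number of storage slots at this level
        wvect.append(float(fields[1]))  # cost of writing one timestep
        rvect.append(float(fields[2]))  # cost of reading one timestep
    return cvect, wvect, rvect


# The example file from this README, with an added comment line.
example = """# three storage levels
3
1 0 0
3 2 2
10 5 5
"""
print(parse_architecture(example))
```

Checking the declared number of levels against the lines actually present catches truncated files early, before the values reach a scheduling routine.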
+
+
+## Output
+
+The following outputs are provided:
+ - [Sequence:] the full schedule following a grammar:
+
+| Operation | Action |
+|--:|---|
+| [F_i] | Executes the i-th forward operation |
+| [F_i->j] | Executes the i, i+1, ..., j-1, j forward operations |
+| [B_i] | Executes the i-th backward operation |
+| [WD_i] | Writes the output of the (i-1)-th forward operation to disk |
+| [RD_i] | Reads the output of the (i-1)-th forward operation from disk |
+| [WM_i] | Writes the output of the (i-1)-th forward operation to memory |
+| [RM_i] | Reads the output of the (i-1)-th forward operation from memory |
+| [DM_i] | Discards the output of the (i-1)-th forward operation from memory |
+
+ - [Memory:] lists all the outputs of forward operations that were stored in memory, in the order that they were saved to memory
+ - [Disk:] lists all the outputs of forward operations that were stored on disk, in the order that they were saved to disk
+ - [Makespan:] Provides the makespan of the schedule. Note that with the options `--uf 1 --ub 0` it gives the number of recomputations of forward steps (as discussed in [3]).
+ - [Period size:] ([Periodic-Disk-Revolve] only) the optimal number of forward steps there should be between two consecutive disk checkpoints. This is the period to use in the online case.
+
+## Result Formats
+
+The `--concat` option gives a more compact format of the Sequence output. Its value [val] can be 0 (default), 1, 2 or 3.
+
+Usage:
+```console
+$ Disk-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd] --concat [val]
+```
+Examples:
+```console
+$ Disk-Revolve 100 3 --uf 1.2 --ub 3 --wd 8 --rd 2.5 --concat 3
+$ Disk-Revolve 200 8 --concat 1
+```
+
+[--concat 0]: The default option gives the entire schedule.
+```console
+$ Disk-Revolve 15 2 --concat 0
+```
+returns:
+```
+Sequence: [WD_0, F_0->5, WM_6, F_6->11, WM_12, F_12, F_13, F_14, B_15, RM_12, F_12, F_13, B_14, RM_12, F_12, B_13, RM_12, B_12, DM_12, RM_6, F_6->8, WM_9, F_9, F_10, B_11, RM_9, F_9, B_10, RM_9, B_9, DM_9, RM_6, F_6->6, WM_7, F_7, B_8, RM_7, B_7, DM_7, RM_6, B_6, DM_6, RD_0, WM_0, F_0->2, WM_3, F_3, F_4, B_5, RM_3, F_3, B_4, RM_3, B_3, DM_3, RM_0, F_0->0, WM_1, F_1, B_2, RM_1, B_1, DM_1, RM_0, B_0, DM_0]
+```
+
+[--concat 1]: Factorizes the subschedules that are computed by a Revolve algorithm.
+```console
+$ Disk-Revolve 15 2 --concat 1
+```
+returns:
+```
+Sequence: [WD_0, F_0->5, Revolve(9, 2), RD_0, Revolve(5, 2)]
+```
+
+Note that to obtain a _defactorized_ version of Revolve(9,2) (for instance), one can call the subprogram `$ Revolve 9 2`.
+
+[--concat 2]: Factorizes the subschedules that are computed by a 1D-Revolve algorithm (see [1]).
+```console
+$ Disk-Revolve 15 2 --concat 2
+```
+returns:
+```
+Sequence: [WD_0, F_0->5, Revolve(9, 2), RD_0, 1D-Revolve(5, 2)]
+```
+Note that to obtain a _defactorized_ version of 1D-Revolve(5,2) (for instance), one can call the subprogram `$ 1D-Revolve 5 2`.
+
+[--concat 3]: To understand this last option, one needs to go back to what a schedule looks like [1,2]. A schedule provided by Disk-Revolve is always of the form: *Forward sweep; Turn; Backward Sweep*.
+
+To explain this schedule, we will use the example of `Disk-Revolve 100 4`, which under the option `--concat 2` returns:
+```
+[WD_0, F_0->14, WD_15, F_15->29, WD_30, F_30->44, WD_45, F_45->59, WD_60, F_60->74, Revolve(25, 4), RD_60, 1D-Revolve(14, 4), RD_45, 1D-Revolve(14, 4), RD_30, 1D-Revolve(14, 4), RD_15, 1D-Revolve(14, 4), RD_0, 1D-Revolve(14, 4)]
+```
+- The *forward sweep* is a sequence of disk writes, each followed by forward operations.
+  E.g.: for `Disk-Revolve 100 4` -> (WD_0, F_0->14, WD_15, F_15->29, WD_30, F_30->44, WD_45, F_45->59, WD_60, F_60->74)
+- The *Turn* is a Revolve function.
+  E.g.: for `Disk-Revolve 100 4` -> Revolve(25, 4)
+- The *backward sweep* is a sequence of disk reads, each followed by a 1D-Revolve operation.
+  E.g.: for `Disk-Revolve 100 4` -> RD_60, 1D-Revolve(14, 4), RD_45, 1D-Revolve(14, 4), RD_30, 1D-Revolve(14, 4), RD_15, 1D-Revolve(14, 4), RD_0, 1D-Revolve(14, 4)
+
+In [2], the term "period" denotes the number of forward operations between two consecutive disk writes. The last option returns the sizes of the successive periods [m1], ..., [mk] and the length of the turn [tn], in the form ([m1], [m2], ..., [mk]; [tn]). To obtain a _defactorized_ version of it, start with WD_0, then F_0->[m1-1], then WD_[m1], etc.
+```console
+$ Disk-Revolve 100 4 --concat 3
+```
+returns:
+```
+Sequence: (15, 15, 15, 15, 15; 26)
+```
+
+## Library interface
+
+The various checkpointing algorithms are also available as library calls in the
+`hrevolve` package. With the exception of HRevolve, these take the positional
+arguments `l` and `cm`, which are integers with the same meaning as above.
+Additional configuration parameters can be set using keyword arguments. The
+result is a `Sequence` object encoding the schedule. For example:
+
+```python
+In [1]: from hrevolve import disk_revolve
+
+In [2]: print(disk_revolve(15, 2, concat=2))
+[WD_0, F_0->5, Revolve(9, 2), RD_0, 1D-Revolve(5, 2)]
+```
+
+The library interface to HRevolve takes the positional arguments `l`, `cvect`,
+`wvect`, and `rvect`. `l` is the length of the graph. The other arguments are
+vectors indicating, respectively, the number of slots at each level of memory
+and the cost of writing and reading a slot at each level of memory.
For +example: + +```python +In [1]: from hrevolve import hrevolve + +In [2]: print(hrevolve(20, [1, 2, 10], [0, 2, 3], [0, 2, 3], concat=2)) +[W^2_0, F_0->6, HRevolve_1(13, 2), R^2_0, F_0->2, HRevolve_1(3, 2), R^2_0, HRevolve_1(2, 2)] +``` + +BIBLIOGRAPHY + +[0] Herrmann, Pallez, "H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms", ACM Transactions on Mathematical Software 46(2), 2020. + +[1] Aupy, Herrmann, Hovland and Robert, "Optimal Multi-stage algorithm for Adjoint Computation", Siam Journal on Scientific Computing, 38(3), 2016. + +[2] Aupy and Herrmann, "Periodicity in optimal hierarchical checkpointing schemes for adjoint computations", Optimization Methods and Software, 32(3): 594-624 , 2017. + +[3] Griewank and Walther, "Algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation." ACM Transactions on Mathematical Software (TOMS) 26.1 (2000): 19-45. diff --git a/hrevolve/hrevolve.py b/hrevolve/hrevolve.py index 0cf0a0377a66ef11a09b2771129a5b62796a627c..2e09613b9942f6fd7fe5cf670555cb10a8fda260 100755 --- a/hrevolve/hrevolve.py +++ b/hrevolve/hrevolve.py @@ -89,11 +89,12 @@ def HRevolve_aux(l, K, cmem, cvect, wvect, rvect, hoptp=None, hopt=None, **param jmin = argmin(list_mem) sequence.insert(Operation("Forwards", [0, jmin - 1])) sequence.insert_sequence( - hrevolve_recurse(l - jmin, 0, cmem - 1, cvect, wvect, rvect, hoptp=hoptp, hopt=hopt).shift(jmin) + hrevolve_recurse(l - jmin, 0, cmem - 1, cvect, wvect, rvect, + hoptp=hoptp, hopt=hopt, **params).shift(jmin) ) sequence.insert(Operation("Read", [0, 0])) sequence.insert_sequence( - HRevolve_aux(jmin - 1, 0, cmem, cvect, wvect, rvect, uf, + HRevolve_aux(jmin - 1, 0, cmem, cvect, wvect, rvect, hoptp=hoptp, hopt=hopt, **params) ) return sequence diff --git a/pyproject.toml b/pyproject.toml index fb76fc26e6aaa9f9d96fc140d49f69e4535fc674..4f0936c831b736664a95294beba64a014d4b3e37 100644 --- a/pyproject.toml +++ 
b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "hrevolve" -version = "0.9.0" +version = "0.9.2" authors = [ { name="Guillaume Pallez (Aupy)", email="guillaume.pallez@inria.fr" }, { name="Julien Herrmann", email="julien.herrmann@inria.fr"} @@ -13,7 +13,7 @@ maintainers = [ { name="David A. Ham", email="david.ham@imperial.ac.uk"} ] description = "An implementation of HRevolve and related checkpointing algorithms." -readme = "Readme.md" +readme = "README.md" requires-python = ">=3.7" classifiers = [ "Programming Language :: Python :: 3",