Commit ea334d26 authored by David Ham

Merge remote-tracking branch 'origin/master' into python_package

parents 8e36b8c9 e7bee64f
# H-REVOLVE: Program reversals with multiple checkpoint levels
Authors: Julien Herrmann, Guillaume Aupy
Licence: See file LICENSE
## Abstract:
This is the library described in the paper "H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms" by Herrmann and Pallez [0].
This code implements a checkpointing strategy for adjoint computation in the presence of multiple levels of storage:
- a fast storage of limited size (e.g. memory),
- a slower storage of larger size (e.g. disks),
- and so on, where the larger the storage, the slower the access.
Details of the assumptions made (write/read costs, etc.) and of what the code does or will do are available in:
- "H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms" by Herrmann and Pallez [0]
- "Optimal Multi-stage algorithm for Adjoint Computation" by Aupy, Herrmann, Hovland and Robert [1]
- "Periodicity in optimal hierarchical checkpointing schemes for adjoint computations" by Aupy and Herrmann [2].
These works are available in the folder "references".
## Requirements
TODO
## Installation
```
pip install git+https://gitlab.inria.fr/adjoint-computation/H-Revolve.git
```
## Main Algorithms:
### INPUTS
The algorithms take as input:
| Input | Meaning |
|--:|--|
| [l] | the length of the graph to be differentiated |
| [cm] | the number of forward time steps that can be stored in memory |
| [uf] | the time to perform a forward time step (Default: 1) |
| [ub] | the time to perform a backward time step (Default: 1) |
| [wd] | the time to store a forward time step in disks (Default: 5) |
| [rd] | the time to retrieve a forward time step from disks (Default: 5) |
### ALGORITHMS
#### Single-level algorithms
- [*Revolve*] Revolve implements the single-level dynamic program presented in [3], where the costs of reads and writes are taken to be 0. It returns a schedule that minimizes the execution time needed to reverse the graph.
Usage:
`$ Revolve [l] [cm] --uf [uf] --ub [ub] `
Examples:
`$ Revolve 100 8 --uf 1.2 --ub 3`
`$ Revolve 20 2`
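
The single-level dynamic program can be sketched in a few lines. The following is a minimal illustration and not the package's implementation. It assumes a cost model in which the input of the graph permanently occupies one memory slot, the graph has `l` forward steps and `l + 1` backward steps, and memory reads and writes are free. The function name `revolve_cost` is hypothetical.

```python
from functools import lru_cache


def revolve_cost(l, cm, uf=1.0, ub=1.0):
    """Minimal makespan to reverse l forward steps with cm memory slots.

    Assumed model (an illustration, not the hrevolve code): the chain input
    always uses one slot, memory reads/writes cost 0, and there are l + 1
    backward steps B_l, ..., B_0.
    """
    if l < 0 or cm < 1:
        raise ValueError("need l >= 0 and cm >= 1")

    @lru_cache(maxsize=None)
    def opt(n, c):
        if n == 0:
            return ub  # a single backward step is left
        if c == 1:
            # Only the input is stored: recompute from it before each
            # backward step, i.e. n + (n - 1) + ... + 1 forward steps.
            return n * (n + 1) / 2 * uf + (n + 1) * ub
        # Advance j steps and checkpoint there, reverse the right part with
        # c - 1 slots, then reverse the left part (j - 1 steps) with c slots.
        return min(
            j * uf + opt(n - j, c - 1) + opt(j - 1, c)
            for j in range(1, n + 1)
        )

    return opt(l, cm)
```

With `uf=1` and `ub=0` the returned value counts forward steps only, and adding memory slots can never increase it, since every schedule valid with `c` slots is also valid with `c + 1`.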
#### Two-level algorithms
The implementations in this section consider a specific model with two levels of storage:
- a fast storage (read and write costs = 0) of limited size (e.g. memory),
- a slow storage of unbounded size (e.g. disks).
- [*Disk-Revolve*] Disk-Revolve implements the two-level dynamic program presented in [1]. It returns a schedule that minimizes the execution time needed to reverse the graph.
Usage:
`$ Disk-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd]`
Examples:
`$ Disk-Revolve 100 8 --uf 1.2 --ub 3 --wd 8 --rd 2.5`
`$ Disk-Revolve 20 2`
- [*Periodic-Disk-Revolve*] Periodic-Disk-Revolve returns a schedule where the number of forward steps between two consecutive disk checkpoints is constant. The schedule provided is shown to be asymptotically optimal in [2]. In addition, a conjecture was made regarding the value of the optimal period that Periodic-Disk-Revolve should use. The `--fast` option computes this conjectured value with a much lower complexity than the default [Periodic-Disk-Revolve] computation (constant time instead of O(l^3)).
Usage:
`$ Periodic-Disk-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd] --fast`
Examples:
`$ Periodic-Disk-Revolve 100 3 --uf 1.2 --ub 3 --wd 8 --rd 2.5`
`$ Periodic-Disk-Revolve 200 8`
- [*Rev-Revolve*] Rev-Revolve is a slight modification of Disk-Revolve in which the reverse sweep uses Revolve instead of 1D-Revolve (see [1]). It returns the optimal strategy when each disk checkpoint can only be read once (see [0] for the proof).
Usage:
`$ Rev-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd]`
Examples:
`$ Rev-Revolve 100 3 --uf 1.2 --ub 3 --wd 8 --rd 2.5`
`$ Rev-Revolve 200 8`
TODO:
- [*Online-Disks*] In [2], it was shown that one can compute an asymptotically optimal online solution (the size of the adjoint graph is not known beforehand) using the optimal period provided by [Periodic-Disk-Revolve].
- [*H-revolve*] Description and usage still to be written.
### OUTPUT
The following outputs are provided:
- [Sequence:] the full schedule following a grammar:
| Operation | Action |
|--:|---|
| [F_i] | Executes the i-th forward operation |
| [F_i->j] | Executes forward operations i, i+1, ..., j-1, j |
| [B_i] | Executes the i-th backward operation |
| [WD_i] | Writes the output of the (i-1) forward operation to disk |
| [RD_i] | Reads the output of the (i-1) forward operation from disk |
| [WM_i] | Writes the output of the (i-1) forward operation to memory |
| [RM_i] | Reads the output of the (i-1) forward operation from memory |
| [DM_i] | Discards the output of the (i-1) forward operation from memory |
- [Memory:] lists all the outputs of forward operations that were stored in memory, in the order in which they were saved to memory
- [Disk:] lists all the outputs of forward operations that were stored on disk, in the order in which they were saved to disk
- [Makespan:] the makespan of the schedule. Note that with the options `--uf 1 --ub 0` it gives the number of recomputations of forward steps (as discussed in [3]).
- [Period size:] ([Periodic-Disk-Revolve] only) the optimal number of forward steps there should be between two consecutive disk checkpoints. This is the period to use in the online case.
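
The grammar above is simple enough to parse with one regular expression. The sketch below is an illustration only: `parse_op` and `makespan` are hypothetical helpers, not part of the package. It assumes a fully expanded schedule (the default `--concat 0` form), assumes that memory operations cost 0 as in the two-level model, and assumes that the makespan is the plain sum of the operation costs.

```python
import re

# Matches tokens such as "F_3", "F_0->5", "B_15", "WD_0", "RM_12".
_OP = re.compile(r"^(F|B|WD|RD|WM|RM|DM)_(\d+)(?:->(\d+))?$")


def parse_op(token):
    """Split one token into (kind, first_index, last_index)."""
    m = _OP.match(token.strip())
    if m is None:
        raise ValueError(f"unrecognised operation: {token!r}")
    kind, first, last = m.group(1), int(m.group(2)), m.group(3)
    if last is not None and kind != "F":
        raise ValueError(f"only F operations take a range: {token!r}")
    return kind, first, int(last) if last is not None else first


def makespan(tokens, uf=1.0, ub=1.0, wd=5.0, rd=5.0):
    """Sum the cost of an expanded schedule; memory operations cost 0."""
    total = 0.0
    for token in tokens:
        kind, first, last = parse_op(token)
        if kind == "F":
            total += (last - first + 1) * uf  # F_i->j runs steps i..j inclusive
        elif kind == "B":
            total += ub
        elif kind == "WD":
            total += wd
        elif kind == "RD":
            total += rd
    return total
```

Factorized tokens such as `Revolve(9, 2)` are deliberately rejected, because their cost cannot be read off the token alone.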
### Result Formats
The `--concat` option gives a more compact format of the Sequence output. Its value [val] can be 0 (default), 1, 2 or 3.
Usage:
`$ Rev-Revolve [l] [cm] --uf [uf] --ub [ub] --wd [wd] --rd [rd] --concat [val]`
Examples:
`$ Disk-Revolve 100 3 --uf 1.2 --ub 3 --wd 8 --rd 2.5 --concat 3`
`$ Disk-Revolve 200 8 --concat 1`
- [--concat 0]: The default option gives the entire schedule.
`$ Disk-Revolve 15 2 --concat 0` returns:
> Sequence: [WD_0, F_0->5, WM_6, F_6->11, WM_12, F_12, F_13, F_14, B_15, RM_12, F_12, F_13, B_14, RM_12, F_12, B_13, RM_12, B_12, DM_12, RM_6, F_6->8, WM_9, F_9, F_10, B_11, RM_9, F_9, B_10, RM_9, B_9, DM_9, RM_6, F_6->6, WM_7, F_7, B_8, RM_7, B_7, DM_7, RM_6, B_6, DM_6, RD_0, WM_0, F_0->2, WM_3, F_3, F_4, B_5, RM_3, F_3, B_4, RM_3, B_3, DM_3, RM_0, F_0->0, WM_1, F_1, B_2, RM_1, B_1, DM_1, RM_0, B_0, DM_0]
- [--concat 1]: Factorizes the subschedules that are computed by a Revolve algorithm.
`$ Disk-Revolve 15 2 --concat 1` returns:
> Sequence: [WD_0, F_0->5, Revolve(9, 2), RD_0, Revolve(5, 2)]
Note that to obtain a _defactorized_ version of Revolve(9,2) (for instance), one can call the subprogram `$ Revolve 9 2`
- [--concat 2]: Factorizes the subschedules that are computed by a 1D-Revolve algorithm (see [1]).
`$ Disk-Revolve 15 2 --concat 2` returns:
> Sequence: [WD_0, F_0->5, Revolve(9, 2), RD_0, 1D-Revolve(5, 2)]
Note that to obtain a _defactorized_ version of 1D-Revolve(5,2) (for instance), one can call the subprogram `$ 1D-Revolve 5 2`
- [--concat 3]: To understand this last option, one needs to go back to what a schedule looks like [1,2]. A schedule provided by Disk-Revolve is always of the form: *Forward sweep; Turn; Backward Sweep*.
To explain this schedule, we will use the example of `Disk-Revolve 100 4` which under the option `--concat 2` returns:
> [WD_0, F_0->14, WD_15, F_15->29, WD_30, F_30->44, WD_45, F_45->59, WD_60, F_60->74, Revolve(25, 4), RD_60, 1D-Revolve(14, 4), RD_45, 1D-Revolve(14, 4), RD_30, 1D-Revolve(14, 4), RD_15, 1D-Revolve(14, 4), RD_0, 1D-Revolve(14, 4)]
- The *forward sweep* is a sequence of Disk writes followed by forward operations.
Eg: for `Disk-Revolve 100 4` -> (WD_0, F_0->14, WD_15, F_15->29, WD_30, F_30->44, WD_45, F_45->59, WD_60, F_60->74)
- The *Turn* is a Revolve function.
Eg: for `Disk-Revolve 100 4` -> Revolve(25, 4)
- The *backward sweep* is a sequence of disk reads followed by 1D-Revolve operations.
Eg: for `Disk-Revolve 100 4` -> RD_60, 1D-Revolve(14, 4), RD_45, 1D-Revolve(14, 4), RD_30, 1D-Revolve(14, 4), RD_15, 1D-Revolve(14, 4), RD_0, 1D-Revolve(14, 4)
In [2], the term "period" denotes the number of forward operations between two consecutive disk writes. This last option returns the sizes of the successive periods [m1], ..., [mk] and the length of the turn [tn], in the form ([m1], [m2], ..., [mk]; [tn]). To obtain a _defactorized_ version, start with WD_0, then F_0->[m1-1], then WD_[m1], and so on.
`$ Disk-Revolve 100 4 --concat 3` returns:
> Sequence: (15, 15, 15, 15, 15; 26)
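
The defactorization rule for the period form can be written out directly. This is a small sketch with hypothetical helper names (`parse_compact`, `forward_sweep`), not package functions. It only rebuilds the forward sweep, and assumes that every range is written as `F_i->j`.

```python
def parse_compact(text):
    """Parse "(m1, m2, ..., mk; tn)" into ([m1, ..., mk], tn)."""
    body = text.strip().strip("()")
    left, right = body.split(";")
    return [int(x) for x in left.split(",")], int(right)


def forward_sweep(periods):
    """Expand period sizes into the WD/F tokens of the forward sweep."""
    ops, start = [], 0
    for m in periods:
        if m < 1:
            raise ValueError("each period must contain at least one step")
        ops.append(f"WD_{start}")                    # disk checkpoint
        ops.append(f"F_{start}->{start + m - 1}")    # m forward steps
        start += m
    return ops
```

Applied to the periods of `(15, 15, 15, 15, 15; 26)`, `forward_sweep` reproduces the forward sweep listed above for `Disk-Revolve 100 4`, from `WD_0, F_0->14` through `WD_60, F_60->74`.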
## Library interface
The various checkpointing algorithms are also available as library calls in the
`hrevolve` package. With the exception of HRevolve, these take the positional
arguments `l` and `cm` which are integers with the same meaning as above.
Additional configuration parameters can be set using keyword arguments. The
result is a `Sequence` object encoding the schedule. For example:
```python
In [1]: from hrevolve import disk_revolve
In [2]: print(disk_revolve(15, 2, concat=2))
[WD_0, F_0->5, Revolve(9, 2), RD_0, 1D-Revolve(5, 2)]
```
The library interface to HRevolve takes the positional arguments `l`, `cvect`,
`wvect`, and `rvect`. `l` is the length of the graph. The other arguments are
vectors indicating, respectively, the number of slots at each level of memory
and the cost of writing and reading a slot at each level of memory. For
example:
```python
In [1]: from hrevolve import hrevolve
In [2]: print(hrevolve(20, [1, 2, 10], [0, 2, 3], [0, 2, 3], concat=2))
[W^2_0, F_0->6, HRevolve_1(13, 2), R^2_0, F_0->2, HRevolve_1(3, 2), R^2_0, HRevolve_1(2, 2)]
```
## Bibliography
[0] Herrmann and Pallez, "H-Revolve: A Framework for Adjoint Computation on Synchronous Hierarchical Platforms", ACM Transactions on Mathematical Software, 46(2), 2020.
[1] Aupy, Herrmann, Hovland and Robert, "Optimal Multi-stage algorithm for Adjoint Computation", SIAM Journal on Scientific Computing, 38(3), 2016.
[2] Aupy and Herrmann, "Periodicity in optimal hierarchical checkpointing schemes for adjoint computations", Optimization Methods and Software, 32(3): 594-624, 2017.
[3] Griewank and Walther, "Algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation", ACM Transactions on Mathematical Software, 26(1): 19-45, 2000.
......@@ -199,6 +199,9 @@ class Sequence:
        else:
            return self.concat_sequence(self.concat).__repr__()

    def __iter__(self):
        return iter(self.concat_sequence(self.concat))

    def canonical(self):
        if self.function.name == "Disk-Revolve":
            concat = 2
......