- Jul 30, 2021
E Madison Bray authored
Prepare package for pypi release. See merge request !91
E Madison Bray authored
Mention licenses in the classifiers in setup.cfg (a classifier for CeCILL-C has just recently been added :) [skip ci]
E Madison Bray authored
Also fix the minimum Python version, which I believe must be 3.7 now.
E Madison Bray authored
Omits licensing information for now, pending discussion on #91.
E Madison Bray authored
Flora/documentation/overview network. See merge request !120
E Madison Bray authored
E Madison Bray authored
E Madison Bray authored
- Jul 29, 2021
E Madison Bray authored
Improved interrupt handling for `dnadna train`. See merge request !127
E Madison Bray authored
Add a default of `seed: null` for the simulator config in the schema. See merge request !126
E Madison Bray authored
schema fixes the issue raised at !123 (comment 551445)
E Madison Bray authored
The message displayed when pausing training is now on a separate line from the progress bar(s). Unfortunately tqdm does not have a public method to get all active progress bars so that their displays could be cleared. It would be better if they were hidden outright and then redrawn in the same place when training resumes, but currently this is difficult to do with tqdm. I have ideas for a workaround, but it's not worth spending a lot of time on. This also handles the case where the user hits Ctrl-D while suspended.
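As an illustration of keeping such a message off the progress-bar lines, here is a minimal sketch using tqdm's public `tqdm.write()` API; the `bars` list of bar references is an assumption made for this example, not part of dnadna's actual code.

```python
from tqdm import tqdm

def announce_pause(bars):
    """Print a pause message without corrupting active progress bars."""
    # tqdm.write() prints above any active bars instead of overwriting them,
    # which keeps the "paused" message on its own line.
    tqdm.write("Training paused; press Enter to resume or Ctrl-C to cancel.")
    # tqdm has no public registry of active bars, so the caller must keep its
    # own references (the hypothetical `bars` list here) in order to refresh
    # or clear them explicitly.
    for bar in bars:
        bar.refresh()
```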
E Madison Bray authored
Previously, trying to interrupt `dnadna train` (e.g. with Ctrl-C) could often take several tries, and would result in a bunch of messy tracebacks (often overlapping each other due to tracebacks from interrupted worker processes). This now handles some interrupts more cleanly. In particular, pressing Ctrl-C does two things:

1) Rather than immediately interrupting the training, it simply pauses it. Pressing Enter resumes the training, and pressing Ctrl-C again cancels it.
2) When canceling the training, it attempts to shut down gracefully: if in a validation pass it interrupts the validation, and if in a training pass it interrupts the training loop and tries to exit cleanly. This is not always 100% guaranteed, as not all code is interrupt-safe, but it will be much rarer to get a non-clean interrupt.

In a follow-up, we could also choose to save a checkpoint right when interrupted. Likewise, trying to terminate the process will attempt a clean shutdown.
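For readers unfamiliar with this pattern, here is a minimal, hypothetical sketch of a pause/resume/cancel SIGINT handler along the lines described above; it is an assumption for illustration, not dnadna's actual implementation, and the exception name `TrainingInterrupted` is made up.

```python
import signal

class TrainingInterrupted(Exception):
    """Raised to unwind the training loop for a clean shutdown."""

def _handle_sigint(signum, frame):
    # Restore the default handler so that a second Ctrl-C while paused raises
    # KeyboardInterrupt out of input() instead of re-entering this handler.
    signal.signal(signal.SIGINT, signal.default_int_handler)
    try:
        input("Training paused; press Enter to resume or Ctrl-C to cancel: ")
    except (KeyboardInterrupt, EOFError):
        # A second Ctrl-C, or Ctrl-D (EOF) while suspended, cancels the run.
        raise TrainingInterrupted from None
    finally:
        # Re-arm the pause handler for the resumed training loop.
        signal.signal(signal.SIGINT, _handle_sigint)

signal.signal(signal.SIGINT, _handle_sigint)
```

The training loop would then catch `TrainingInterrupted` at a safe point (e.g. at the end of a batch or a validation step) and exit cleanly.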
E Madison Bray authored
Improved Simulator documentation. See merge request !88
- Jul 28, 2021
E Madison Bray authored
commands' implementations; instead it prints/logs them. For testing purposes it's better to be able to catch and check the original exceptions, so a `raise_exceptions` flag is added to `Command.main`.
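A minimal sketch of what such a flag might look like, assuming a `Command.main` classmethod that normally logs errors rather than letting tracebacks propagate; the details here are illustrative, not dnadna's exact code.

```python
import logging
import sys

log = logging.getLogger(__name__)

class Command:
    @classmethod
    def main(cls, argv=None, raise_exceptions=False):
        try:
            return cls.run(argv)
        except Exception as exc:
            if raise_exceptions:
                # In tests, re-raise so the original exception can be caught
                # and inspected directly.
                raise
            # Normal CLI behavior: log the error rather than dumping a traceback.
            log.error(str(exc))
            sys.exit(1)

    @classmethod
    def run(cls, argv):
        raise NotImplementedError
```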
E Madison Bray authored
E Madison Bray authored
around this pandas/numpy bug: https://github.com/pandas-dev/pandas/issues/39520. If I understand correctly, the bug only occurs when initializing an "empty" DataFrame (even if the index is not empty). Instead we construct a dict of the columns first, and then initialize the DataFrame from this dict, which should avoid triggering the bug.
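A short sketch of the workaround as described, with made-up column names; the commented-out lines show the construction pattern that can trigger the upstream issue.

```python
import numpy as np
import pandas as pd

index = range(1000)
columns = ['position', 'allele_count']  # hypothetical column names

# Pattern that can hit pandas issue #39520: create an "empty" DataFrame
# (no data, even though the index is non-empty) and fill it column by column.
# df = pd.DataFrame(index=index, columns=columns)
# df['position'] = np.zeros(len(index))

# Workaround: build a dict of fully-formed columns first, then construct the
# DataFrame from that dict in a single step.
data = {col: np.zeros(len(index)) for col in columns}
df = pd.DataFrame(data, index=index)
```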
E Madison Bray authored
E Madison Bray authored
documentation, including the tutorial on writing a custom simulator. The code in this documentation has been hand-tested but is not automatically tested. That will be a task for the future.
E Madison Bray authored
documentation.
E Madison Bray authored
E Madison Bray authored
usable. Since !74 the note in its docstring about providing default templates is no longer valid either. Its existence is only likely to confuse users, since it cannot be run.
E Madison Bray authored
Start to address #84. See merge request !117
- Jul 27, 2021
E Madison Bray authored
Slightly improved error formatting. See merge request !122
E Madison Bray authored
Fix the test_random_seed test on CUDA again. See merge request !124
E Madison Bray authored
when using --overwrite instead of --backup. The fact that the same method is used for both --backup and --overwrite is a bit messy, but there's enough overlap in their functionality that I'm keeping it as is for now.
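To illustrate why one method can cover both flags, here is a hypothetical helper (not dnadna's actual method, and the name `prepare_output` is made up): both options start from the same existence check and differ only in what happens to the old file.

```python
import shutil
from pathlib import Path

def prepare_output(path, overwrite=False, backup=False):
    """Decide what to do with an existing output file."""
    path = Path(path)
    if not path.exists():
        return path
    if backup:
        # --backup: move the old file aside before writing the new one.
        shutil.move(str(path), str(path) + '.bak')
    elif not overwrite:
        # Neither flag given: refuse to clobber the existing file.
        raise FileExistsError(
            f'{path} exists; pass --overwrite or --backup to replace it')
    # --overwrite: fall through and let the caller replace the file in place.
    return path
```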
E Madison Bray authored
[documentation] remove note about random seed in training docs. See merge request !125
E Madison Bray authored
This warning is no longer applicable since I fixed it in !123 [skip ci]
E Madison Bray authored
E Madison Bray authored
Since merging !72 this seems to fail randomly a lot for SPIDNA. It didn't fail on the MR for some reason, but due to its non-deterministic nature (especially when the tests are run in parallel) there could be some other minor influence causing it to fail more often now that it's merged.
E Madison Bray authored
[bug] get rid of all default seeds in different configuration sources. See merge request !123
- Jul 26, 2021
E Madison Bray authored
There are currently 3 "seed" options:

1) a "seed" for simulations
2) a "seed" for preprocessing (this mostly controls randomization of dataset splits)
3) a "seed" for training

All of these had default values in the default config files, I think partly as an artifact of porting over some of Jean and Theophile's old config files into the code. In practice, users should expect stochasticity by default. The schemas now have a default value of "null" for all these seeds, which is equivalent to random seeding of the PRNG. If users want to set a specific seed for reproducibility they should do so manually. (Possible future enhancement: record the seed that was used so they can reproduce the same run even if the seed was not set explicitly first.)
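As a rough sketch of the convention described above (assuming, as the commit implies, that a null seed means "seed the PRNGs randomly"), the hypothetical helper below shows how such a seed might be applied and returned so it could later be recorded; it is illustrative only, not dnadna's actual code.

```python
import random

import numpy as np
import torch

def seed_rngs(seed=None):
    """Seed the standard PRNGs; a None/null seed means "seed randomly"."""
    if seed is None:
        # No seed configured: draw one from the OS so runs are stochastic by
        # default, but keep the value so it could be recorded for reproduction.
        seed = random.SystemRandom().randrange(2**32)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    return seed
```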