Improved interrupt handling for dnadna train

Previously, interrupting dnadna train (e.g. with Ctrl-C) often took several tries and produced messy tracebacks, frequently overlapping one another because interrupted worker processes printed their own tracebacks.

This now handles some interrupts more cleanly.

In particular, pressing Ctrl-C does two things:

  1. Rather than immediately interrupting the training, it simply pauses it. Pressing Enter resumes the training, and pressing Ctrl-C again cancels it.

  2. When canceling the training, it attempts to shut down gracefully: if in a validation pass, it interrupts the validation; if in a training pass, it interrupts the training loop and tries to exit cleanly. A clean exit is not guaranteed, since not all code is interrupt-safe, but non-clean interrupts should be rarer.

    In a follow-up, we could also choose to save a checkpoint right when interrupted.
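
The pause/cancel behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this merge request: the names InterruptControl, train, step and prompt are hypothetical, and the sketch assumes the training loop checks for a pending pause between batches.

```python
import signal


class InterruptControl:
    """Hypothetical helper: first Ctrl-C pauses, second Ctrl-C cancels."""

    def __init__(self):
        self.paused = False
        self.cancelled = False
        self.waiting = False  # True while blocked on the resume prompt

    def handle_sigint(self, signum=None, frame=None):
        if not self.paused:
            # First Ctrl-C: only request a pause at the next batch boundary.
            self.paused = True
            return
        # Ctrl-C while already paused: request cancellation.
        self.cancelled = True
        if self.waiting:
            # Break out of the blocking prompt so the loop can exit.
            raise KeyboardInterrupt

    def install(self):
        """Register the handler (must be called from the main thread)."""
        return signal.signal(signal.SIGINT, self.handle_sigint)


def train(batches, step, control, prompt=input):
    """Call step(batch) per batch; return the number of completed batches."""
    completed = 0
    for batch in batches:
        if control.paused and not control.cancelled:
            control.waiting = True
            try:
                prompt("Paused. Press Enter to resume, Ctrl-C to cancel: ")
                control.paused = False  # Enter pressed: resume
            except (KeyboardInterrupt, EOFError):
                control.cancelled = True
            finally:
                control.waiting = False
        if control.cancelled:
            break  # leave the loop so normal cleanup can run
        step(batch)
        completed += 1
    return completed
```

In this sketch the handler never interrupts a batch in progress; it only sets flags that the loop inspects, which is one way to avoid overlapping tracebacks. A real training loop would call control.install() once before starting.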

Likewise, trying to terminate the process will attempt a clean shutdown.
