Flora/documentation/overview network

Merged JAY Flora requested to merge flora/documentation/overview_network into master
1 unresolved thread

partly addressing documentation sprint issue #83 regarding overview

Edited by E Madison Bray

Activity

and/or "untransform" the predicted parameters). Alternatively the final
config file of a run ``run_{runid}/my_model_run_{runid}_final_config.yml``
can be passed (in which case the best network of the given run is used by
default).

* INPUT: path to one or more npz files, or to a :ref:`dataset config file <dnadna-dataset-simulation-config>` (describing a whole dataset).


A typical usage will thus be:

.. code-block:: bash

    $ dnadna predict run_{run_id}/my_model_run_{run_id}_best_net.pth realdata/dataset.npz

to classify/predict evolutionary parameters for a single dataset
``realdata/dataset.npz`` in :doc:`DNADNA dataset format <datasets>`.
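
Since INPUT accepts one or more npz files, data that has been pre-cut into several files can presumably be passed in a single call. The following is a minimal dry-run sketch under that assumption: ``predict_cmd`` is a hypothetical helper (not part of DNADNA), and the run ID ``000`` and the ``segment_*.npz`` file names are made up for illustration. It only prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Dry run: print the `dnadna predict` command for one model and one or
# more per-segment .npz files. All paths below are hypothetical.
predict_cmd() {
    local model="$1"
    shift
    # At least one input file is required.
    if [ "$#" -eq 0 ]; then
        echo "no input .npz files given" >&2
        return 1
    fi
    printf 'dnadna predict %s' "$model"
    printf ' %s' "$@"   # the format is reused once per remaining argument
    printf '\n'
}

predict_cmd run_000/my_model_run_000_best_net.pth \
    realdata/segment_01.npz realdata/segment_02.npz
```

Dropping the dry run and calling ``dnadna predict`` directly with the same arguments would be the real invocation, provided the installed version accepts several files as described in the INPUT line above.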
  • I'm a little confused by the intended meaning here. Currently (although we could and should) we do not define a format for storing an entire dataset in a single file. A single file can contain one SNP matrix + position array, so isn't it misleading to call this a "dataset"?

  • Author Owner

    yes you're right, for now the user has to pre-cut the genome into multiple pieces (corresponding to replicates), each saved into an .npz file. I guess the definition of a real dataset changes depending on the context (for selection it could indeed be just a piece of a genome). Or people might in the future build a network that scans the whole data stored in one file (in particular for species with tiny genomes). However, I see that it can confuse people. Let's change this.

    @j.guez are you working more on it?

  • I agree, everything you write makes sense and I believe should be an option. It may also be useful for improving I/O performance on some distributed filesystems. But for now each file contains a single "datum" as it were.

  • @fjay No, I don't have anything to add, at least for the moment.

    Edited by Jérémy Guez
  • changed this line in version 8 of the diff

  • I changed this a bit in b5bd9735

  • Author Owner

    OK thanks ! So I think this can be merged

  • E Madison Bray added 1 commit

    • 0f721dd9 - [documentation] misc minor nitpicks

  • E Madison Bray added 70 commits

    • 0f721dd9...c0a45224 - 62 commits from branch master
    • 14774d9d - start working on overview
    • c9c5dfbb - mention of filenam_format propertie
    • 502a4c39 - first version of completed overview
    • 15a59790 - [documentation] minor updates, mostly for spelling/formatting
    • ba8ac4a2 - Update prediction.rst
    • f953e9b9 - Update prediction.rst : minor changes
    • 7466e46e - Update overview.rst - summarize the prediction part
    • 3d95d116 - [documentation] misc minor nitpicks

  • E Madison Bray added 1 commit

    • b5bd9735 - [documentation] change potentially confusing verbiage about "datasets"

  • E Madison Bray resolved all threads

  • E Madison Bray approved this merge request

  • E Madison Bray marked this merge request as ready

  • E Madison Bray enabled an automatic merge when the pipeline for b5bd9735 succeeds

  • E Madison Bray mentioned in commit 0475ac08
