- 23 Nov, 2018 1 commit
-
-
Mathieu Giraud authored
should@8f711a78
-
- 16 Oct, 2018 7 commits
-
-
Thonier Florian authored
link to #3054
-
Thonier Florian authored
link to #3054
-
Thonier Florian authored
link to #3054
-
Thonier Florian authored
-
Thonier Florian authored
-
Thonier Florian authored
link to #3054
-
- 15 Oct, 2018 1 commit
-
-
Mikaël Salson authored
This makes it possible to add percentages more reliably; otherwise they would give meaningless results. The test now has 497 reads out of 786861.
-
- 07 Aug, 2018 5 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Ryan Herbert authored
allows vidjil files to contain no clones
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
- 26 Jul, 2018 1 commit
-
-
Alexia Omietanski authored
Fixes #2929
-
- 18 Jul, 2018 25 commits
-
-
Ryan Herbert authored
So that dependencies only need to be installed if VidjilParser is used. Thanks @mikael-s :)
-
Ryan Herbert authored
-
Ryan Herbert authored
Using a fast parser saves a significant amount of execution time.
-
Ryan Herbert authored
Restore old functionality and make new functionality optional. Due to the amount of time it takes to run the vidjilparser, we may need to decide between speed and memory efficiency. See #3234
-
Ryan Herbert authored
split VidjilWriter into VidjilWriter and VidjilFileWriter
-
Ryan Herbert authored
use yajl2 as the ijson parser since it is much faster. See #3234, #3235
-
Ryan Herbert authored
use a list for the buffer and join strings when needed. This should be slightly faster than concatenating strings.
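The list-based buffering described above can be illustrated with a minimal sketch. The class name `ListBuffer` and its methods are hypothetical and are not taken from vidjilparser.py; the sketch only shows the general pattern of collecting fragments in a list and joining them once.

```python
class ListBuffer:
    """Hypothetical buffer that collects string fragments in a list."""

    def __init__(self):
        self._parts = []

    def write(self, text):
        # Appending to a list avoids building a new string on every write.
        self._parts.append(text)

    def getvalue(self):
        # Join once, only when the full string is actually needed.
        return "".join(self._parts)

    def clear(self):
        # Drop everything collected so far.
        self._parts = []


buf = ListBuffer()
for fragment in ('{"clones": [', '{"id": "a"}', ']}'):
    buf.write(fragment)
result = buf.getvalue()
```

Whether this is measurably faster than repeated concatenation depends on the interpreter and on the number of fragments, which matches the commit's own "should be slightly faster" wording.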
-
Ryan Herbert authored
due to the error in the prefix usage, we were extracting all clones from the file instead of the top N, leading to horrendous performance issues.
-
Ryan Herbert authored
changes to vidjilparser.py allow us to simply extract the root of a json file and still apply constraints to subfields, so we no longer need to specify all the fields we need to extract. See #3234
-
Ryan Herbert authored
fixes an issue where applying a new predicate to a subfield of a prefix included in another field would not be taken into account. See #3235
-
Ryan Herbert authored
group all checks into a single call to any, to avoid multiple iterations over the same data. Also remove some comparisons that don't seem necessary.
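As a rough illustration of grouping several checks into one `any` call, the sketch below tests a prefix against a collection of wanted prefixes in a single pass. The function name `matches_any` and the exact conditions are assumptions; the checks actually performed in vidjilparser.py are not shown in this log. The dot-separated prefix form follows the convention ijson uses for nested keys.

```python
def matches_any(prefix, wanted):
    """Return True if `prefix` equals, or is nested under, any wanted prefix."""
    # One generator expression, one pass over `wanted`; `any` stops at the first hit.
    return any(
        prefix == w or prefix.startswith(w + ".")
        for w in wanted
    )


hit = matches_any("clones.item.top", ["reads", "clones"])
miss = matches_any("clonesx", ["clones"])
```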
-
Ryan Herbert authored
The introduction of the buffer caused an issue with comma delimitation when the first element of an array or map was discarded (from not meeting predicate requirements), leading to an extra comma and therefore an invalid JSON output. By saving the value of the "previous" variable when buffering begins, we can restore the variable's value when discarding the buffered content. See #3235
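The comma problem and its fix can be reproduced in a small sketch. Everything here (the class `SketchWriter`, its method names, and the string values used for `previous`) is hypothetical and much simpler than the real writer; it only assumes that a comma is emitted whenever the previously written event was a value.

```python
class SketchWriter:
    """Hypothetical JSON array writer with a discardable buffer."""

    def __init__(self):
        self.out = []                 # committed output fragments
        self.pending = []             # buffered fragments, not yet committed
        self.buffering = False
        self.previous = None          # None, "start" or "value"
        self._saved_previous = None

    def _emit(self, text):
        (self.pending if self.buffering else self.out).append(text)

    def start_array(self):
        self._emit("[")
        self.previous = "start"

    def value(self, text):
        # A comma is needed only if something was already written in this array.
        if self.previous == "value":
            self._emit(",")
        self._emit(text)
        self.previous = "value"

    def end_array(self):
        self._emit("]")
        self.previous = "value"

    def start_buffering(self):
        self.buffering = True
        self._saved_previous = self.previous   # remember the state before buffering

    def commit(self):
        self.out.extend(self.pending)
        self.pending = []
        self.buffering = False

    def discard(self):
        self.pending = []
        self.buffering = False
        self.previous = self._saved_previous   # undo the effect of the discarded content


w = SketchWriter()
w.start_array()
w.start_buffering(); w.value("1"); w.discard()   # first element rejected
w.start_buffering(); w.value("2"); w.commit()    # second element kept
w.end_array()
output = "".join(w.out)
```

Without the restore in `discard`, `previous` would still read "value" after the first element is dropped, and the second element would be written with a leading comma.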
-
Ryan Herbert authored
since we may not want to export to a file, leaving the filepath variable at None will disable the file writing capabilities.
-
Ryan Herbert authored
-
Ryan Herbert authored
-
Ryan Herbert authored
This is mostly for Python 3 compatibility. The default mode seems to differ between Python 2 and 3, causing an issue when attempting to run fuse with Python 3.
-
Ryan Herbert authored
The aim here is to improve the performance of fuse.py by using VidjilParser to extract only the data that is relevant to the context we are in, before loading the data into Python objects. See #3234
-
Ryan Herbert authored
it was there for testing purposes only.
-
Ryan Herbert authored
resets the parser prefixes and the writer's buffer
-
Ryan Herbert authored
makes use of buffering and predicate checking to include data in, or exclude it from, the export, based on a comparison function passed when setting the desired prefixes. See #3235, #2240
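The idea of attaching a comparison function to a field can be sketched as follows. This is a simplified stand-in, not the parser's code: the real tool works on a stream of parser events, whereas this sketch filters already-loaded dictionaries. The function `filter_items`, the field name `top`, and the sample data are all assumptions made for illustration.

```python
def filter_items(items, predicates):
    """Keep an item only if every predicate registered for one of its fields accepts the value."""
    kept = []
    for item in items:
        buffered = dict(item)  # stand-in for content held in the writer's buffer
        accepted = all(
            predicate(buffered[field])
            for field, predicate in predicates.items()
            if field in buffered
        )
        if accepted:
            kept.append(buffered)  # "commit" the buffered item
        # otherwise the buffered item is simply dropped
    return kept


clones = [{"id": "a", "top": 1}, {"id": "b", "top": 50}, {"id": "c", "top": 3}]
top_clones = filter_items(clones, {"top": lambda value: value <= 5})
```

Passing the predicate as a plain function keeps the selection rule out of the extraction code, so the same mechanism can serve different constraints.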
-
Ryan Herbert authored
-
Ryan Herbert authored
-
Ryan Herbert authored
-
Ryan Herbert authored
Adds a system where data can be put aside and is not written to the output file unless requested. Also returns the extracted data string at the end of write. See #2240
-
Ryan Herbert authored
a basic version of a tool to validate and parse/extract data from large vidjil files. See #2240
-