-
POTTIER Francois authoredPOTTIER Francois authored
After the Initial Release
-
Without taking the trouble of using the fuzzer to test the code in
Emulated
, we could perhaps develop a suite ofcram
tests. -
Create an iterator directly on a short sequence and avoid the indirection in ShortPersistentSequence.
-
Can we compile in
-unsafe
mode? (Only in release mode.) Does it make a difference? -
Decide if
iter_segments
should take three separate parameters or a 3-tuple (an array segment). -
Expose a module
Segment
, with functions such asSegment.iter
? -
For a stack usage, i.e., a sequence accessed only at back side, the code can be tweaked so that there is no need to allocate the front chunk. Moreover, a few tests can be saved, e.g. in
is_empty
. The functor settings "back_access_only" should be added to control this behavior. At any time, the structure can be turned back into a normal structure by allocating a proper front chunk. We need to decide whether the front chunk would then be silently restored when needed, or whether we want to force the programmer to call an explicit conversion function to restore the front chunk of the sequence. -
Should we expose both
iter2_lax
anditer2
, or just the lax version? Exposing both is useful if we consider that the user is allowed to catchInvalid_argument
. If we consider that the user is not allowed to do this, then we can expose justiter_lax
and call ititer
. Similar question for other functions, e.g.exists2
,for_all2
.equal
isfor_all2
(strict). Addition: I think the best solution foriter2
is to define just one function, which raises a user exception (notInvalid_argument
). Unfortunately, this does not work formap2
,fold_left2
, etc. because raising an exception prevents us from returning a result. Addition bis: we could expose just one version (the lax version, which is more flexible) and let the user handle size mismatches however they wish; they can do so cheaply. -
P.filter
naturally returns an ephemeral sequence. This is more powerful than returning a persistent sequence, but it can be a pain to manually callsnapshot_and_clear
when a persistent sequence is desired. Same question for several other functions.flatten_map
is doubly-tricky. Ideally, its argumentf
should have type'a -> 'b Seq.t
. That would be more general than eithera -> 'b Ephemeral.t
ora -> 'b Persistent.t
. -
make
andfill
in PSeq could share all their schunks. -
Document the time complexity of the new operations (iterators, etc.).
-
Should all of the conversion functions (
of_array
,to_array
, etc.) take adirection
parameter? -
Restructure the code so that it compiles with OCaml < 4.08.
-
Check if specialisation of
pov
's and measures is performed as expected. Eta-expansion at the outermost level may be necessary to trigger it. -
Check if
Segment.iter
is properly inlined/specialised infind_weight_index
. -
Add a constructor
Single
to ShareableSequence. -
Document, dynamically-check, and exploit the invariant that an ephemeral sequence may or may not be owner of all of its schunks. Maybe add a field in the ephemeral sequence record to keep track of the fact that the user wishes to maintain this property, or (better?) just of the fact that this property is currently true or not. Maintaining the invariant on
concat
operations requires particular care. -
Consider other values for
max_length_of_free_list
; benchmark with LIFO and FIFO scenarios. -
Benchmark
make
andinit
versus iterated pushes, out of curiosity. Document the result. -
Benchmark: show point clouds or confidence intervals.
-
Think about reference counting. Think about transactions (take a snapshot, do some updates, then decide to revert or commit; can we keep unique ownership?). Think about backtracking scenarios. Can we implement
checkpoint
andrevert
? What dynamic checks are required in order to detect incorrect usage? -
Once odoc issue #417 is fixed, merge the branch
supply_default
. Or find a way of reformulating the code to work around the issue. (Avoidinclude
followed with shadowing.) -
What about an
approximate_split
function that splits near the middle, and is more efficient thansplit
? (Be careful to control the ratio of imbalance. Maybe sufficient to simply never a split a chunk at the outermost level.) -
Implement
take
anddrop
, i.e., specialized versions ofsplit
where only one part of the sequence is actually constructed. -
Benchmark to see if the free list is useful (in some scenarios) and how much it costs (in general). Remove it if it is useless. If we keep it, tune its maximum length.
-
Think about setting up sequential and parallel benchmarks under Multicore OCaml.
-
Benchmark to measure the benefits and costs of inner buffers.
-
Emulate
Array
. -
Emulate some
Vector
(which one?). Benchmark againstCCVector
,CCFun_vec
,BatVect
,Clarity.Vector
... See the benchmarks in ocaml-containers. -
Complete the
Stack
andQueue
wrappers withto_seq
,add_seq
,of_seq
. -
Benchmark our emulated
Stack
,Queue
,Vector
,Array
,List
,Buffer
against the originals. (Measure time and space.) -
Should we expose
update
,xchg
,get_and_update
? Maybe not necessary if we have an iterator withset
. -
Implement
rev
on ephemeral sequences. Two variants: in place, not in place. -
Think about adding a sense field to obtain O(1) reversal of sequences.
-
Document iterator invalidation. Detect it at runtime? Add a
version
field in ephemeral sequences. Use a functor parameter to enable these dynamic checks (false
by default?). Note that an attempt byf
to modify the collection whileiter f
is running should also be detected. The test can be performed just once at the end of iteration (either upon normal termination, or upon exceptional termination). -
Implement
pushn_back
to push a range of values from an external array into the sequence. Likewise,popn
to pop and blit into an external array. (We already haveof_array_segment
. Avoid excessive redundancy.) -
Allow blitting a subsequence into a subarray (
to_array_segment
). (Use the iterator to do this?) (Related to above.) -
Add blit functions between Sek objects, with a version that uses sharing and a version that does not introduce any sharing.
-
Encoding
sub
in terms ofsplit
can be expensive; propose direct implementations ofsub
. (Related to above.) -
Does the generated code change when we expose the fact that the type
segment
is equal toint
? This might allow the compiler to remove the write barrier when updating aview
field. Is performance affected? Benchmark. -
Add relaxed operations where the data structure does not represent a sequence, but a bag. In that case, all
push
operations affect the back chunk. One can maintain the invariant that all chunks in the middle sequence are full. Thebag_concat
function can be simplified and optimised. Abag_split
operation splits the bag into two parts of a specified size. -
map
andmapi
functions are missing. Add in-placemap
andmapi
for ephemeral sequences. (The nametransform
comes to mind.) -
Define
insert
,delete
, etc. (either usingsplit
andconcat
, or directly). -
Evaluate the interest of having owners on nodes from ShareableSequences. The benefit would be to update weights in place on insertion operations.
-
Report
odoc
bug where a heading in an included signature (@inline
) does not appear in the table of contents of the current page. -
Using an iterator and
init
, can implementcompact
. Or do it by popping segments off one sequence and pushing them into another. -
Think about more aggressive density invariants (e.g. fuse 3 chunks into 2, etc.).