Update Buffer

Lucas Meyer requested to merge memory_buffer into develop

This MR does the following:

  • Introduce a new buffer that evicts on write and samples randomly (a sketch follows this list).
  • Factor the buffer classes into mixins.
  • Add tests to check the behaviour and performance of the different buffers.
  • Add type annotations so that the mixin classes partly comply with Mypy.
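
A minimal sketch of what such a buffer could look like, for illustration only: the class and method names (EvictOnWriteBuffer, put, sample) are hypothetical and are not taken from this MR.

```python
import random
from typing import Any, List


class EvictOnWriteBuffer:
    """Illustrative fixed-capacity buffer: drops the oldest item when a
    new one is written, and supports uniform random sampling."""

    def __init__(self, maxsize: int) -> None:
        self.maxsize = maxsize
        self._items: List[Any] = []

    def put(self, item: Any) -> None:
        # Evict on write: remove the oldest item once capacity is reached.
        if len(self._items) >= self.maxsize:
            self._items.pop(0)
        self._items.append(item)

    def sample(self, batch_size: int) -> List[Any]:
        # Uniform random sampling (without replacement within one call).
        return random.sample(self._items, min(batch_size, len(self._items)))
```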

Some major changes from the previous buffer implementation:

  • Copy the queue.Queue methods the buffer needs, but do not inherit from it directly, because it has additional features we don't want, such as join and task_done (a sketch of this approach follows this list).

  • Replace the queue container, switching from collections.deque to a plain list. The motivation is that we want random access, and the deque documentation states:

    Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead.

  • Use mixins to separate components and features, based on a previous suggestion from @rcaulk. I'm not sure my implementation is the most readable, but we can now build batch and threshold variants for virtually any queue we want, and test those components independently. I added tests to make sure the actual behaviour matches the expected one (a mixin sketch follows this list).

  • test_buffer.py also allows checking the performance of the different buffer implementations. The experiment generates 500,000 samples with multi-threading. The buffer size is 1,000, the threshold is 200, and the batch size is 16.

    Buffer                          | # Samples Seen | Time (s)
    --------------------------------|----------------|---------
    ThresholdQueue                  |        500,000 |       12
    ThresholdReservoirQueue         |      1,213,000 |       85
    BatchThresholdEvictOnWriteQueue |      2,848,088 |       17

    The experiment also shows that we don't have to call torch.from_numpy explicitly, as this is done automatically by the default collate_fn (a small check follows this list).
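
To illustrate the first two points, here is a hedged sketch of a queue that re-implements only the queue.Queue machinery we actually need (a lock, a not_empty condition, put/get) on top of a plain list, instead of inheriting join/task_done; the ListQueue name is made up for this example.

```python
import threading
from typing import Any, List


class ListQueue:
    """Illustrative: composition instead of inheriting queue.Queue.
    No join()/task_done(), and the container is a plain list so that
    random (indexed) access stays O(1)."""

    def __init__(self) -> None:
        self._items: List[Any] = []
        self._mutex = threading.Lock()
        self._not_empty = threading.Condition(self._mutex)

    def put(self, item: Any) -> None:
        with self._mutex:
            self._items.append(item)
            self._not_empty.notify()

    def get(self) -> Any:
        with self._not_empty:
            # Block until at least one item is available.
            while not self._items:
                self._not_empty.wait()
            return self._items.pop(0)

    def qsize(self) -> int:
        with self._mutex:
            return len(self._items)
```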
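
For the mixin point, a sketch of how independent behaviours could be composed onto a base queue. The composed class name mirrors the table above, but the bodies are illustrative and not the actual implementation (a type checker may want Protocols for the shared _items attribute, hence only partial Mypy compliance).

```python
import random
from typing import Any, List


class BaseQueue:
    """Plain list-backed queue that evicts on write."""

    def __init__(self, maxsize: int, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.maxsize = maxsize
        self._items: List[Any] = []

    def put(self, item: Any) -> None:
        if len(self._items) >= self.maxsize:
            self._items.pop(0)  # evict on write
        self._items.append(item)


class ThresholdMixin:
    """Report readiness only once `threshold` items have been stored."""

    def __init__(self, threshold: int, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.threshold = threshold

    def ready(self) -> bool:
        return len(self._items) >= self.threshold


class BatchSampleMixin:
    """Draw fixed-size random batches from the underlying container."""

    def __init__(self, batch_size: int, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.batch_size = batch_size

    def sample(self) -> List[Any]:
        # Assumes the caller checks ready() first, so enough items exist.
        return random.sample(self._items, self.batch_size)


class BatchThresholdEvictOnWriteQueue(BatchSampleMixin, ThresholdMixin, BaseQueue):
    """Behaviour assembled entirely from the pieces above."""


q = BatchThresholdEvictOnWriteQueue(batch_size=16, threshold=200, maxsize=1000)
```

Each mixin can be combined with a trivial base class on its own, which is what makes testing the components independently possible.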
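
On the torch.from_numpy point, a small self-contained check (dataset and sizes made up for illustration) showing that the default collate_fn already stacks numpy arrays into a tensor:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class NumpyDataset(Dataset):
    """Yields raw numpy arrays; torch.from_numpy is never called."""

    def __init__(self, n: int = 64, dim: int = 4) -> None:
        self.data = np.random.rand(n, dim).astype(np.float32)

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> np.ndarray:
        return self.data[idx]


batch = next(iter(DataLoader(NumpyDataset(), batch_size=16)))
assert isinstance(batch, torch.Tensor)  # default collate produced a (16, 4) tensor
```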

Edited by Lucas Meyer
