Add ArrayQueue as Buffer (!73) · Merge requests · melissa / Melissa

Lucas Meyer requested to merge array-buffer into develop Jan 06, 2023

This MR introduces a buffer whose container is an array. It is a placeholder for the work introduced in https://gitlab.inria.fr/melissa/melissa-combined/-/merge_requests/66, which will be updated with the latest changes from develop.

The ArrayQueue introduces two new features:

the container is a numpy.ndarray;
data are served already batched.

Batches are currently generated by an iterator that calls the buffer's get method. If data are not evicted on reading, the get method may return the same data twice, producing batches with redundant data. Getting batched data directly avoids this situation.
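The redundancy issue can be illustrated with a minimal sketch that uses only the standard library. The names get_one and get_batch are hypothetical stand-ins and are not Melissa's buffer API; the sketch only assumes that an item-by-item read picks a stored sample at random without evicting it.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

stored = list(range(5))  # five samples held by a hypothetical buffer
batch_size = 4
n_trials = 1000


def get_one(items):
    """Read a single sample without evicting it (hypothetical get)."""
    return random.choice(items)


def get_batch(items, k):
    """Read k distinct samples in a single call."""
    return random.sample(items, k=k)


# Batches assembled by calling get_one repeatedly may contain repeats.
item_by_item = [
    [get_one(stored) for _ in range(batch_size)] for _ in range(n_trials)
]
# Batches drawn in one call never contain the same sample twice.
batched_reads = [get_batch(stored, batch_size) for _ in range(n_trials)]

redundant_count = sum(len(set(b)) < batch_size for b in item_by_item)
print(f"item-by-item batches with repeated samples: {redundant_count}/{n_trials}")
```

Drawing the whole batch in one call samples without replacement, which is what serving batched data from the buffer relies on.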

Because data are already batched, the dataloader in server.py must be modified:

dataloader = torch.utils.data.DataLoader(self.dataset, batch_size=None, num_workers=0)

Setting batch_size=None disables automatic batching, so the data are not collated and no extra batch dimension is added, since they already arrive as batches.
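The effect of batch_size=None can be checked with a small sketch. PreBatchedDataset is a hypothetical stand-in for a dataset backed by a batching buffer, and the shapes are arbitrary; only torch.utils.data.DataLoader and IterableDataset come from PyTorch.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset


class PreBatchedDataset(IterableDataset):
    """Hypothetical dataset whose items are already full batches."""

    def __init__(self, n_batches, batch_size, n_features):
        self.n_batches = n_batches
        self.batch_size = batch_size
        self.n_features = n_features

    def __iter__(self):
        for _ in range(self.n_batches):
            # Each yielded item already has a leading batch dimension.
            yield torch.zeros(self.batch_size, self.n_features)


dataset = PreBatchedDataset(n_batches=3, batch_size=4, n_features=5)

# Automatic batching disabled: items are passed through unchanged.
no_collate = DataLoader(dataset, batch_size=None, num_workers=0)
# Automatic batching enabled: an extra leading dimension is added.
collated = DataLoader(dataset, batch_size=1, num_workers=0)

shape_no_collate = tuple(next(iter(no_collate)).shape)
shape_collated = tuple(next(iter(collated)).shape)
print(shape_no_collate, shape_collated)
```

With any integer batch_size, the loader would stack the pre-batched items again, which is why the MR switches to batch_size=None.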

Additionally, a user may want to store data in more than one array, for instance positions and field values, which may not have the same dimensions. This requires overriding the default ArrayQueue methods.

Here is an example for the MP-PDE solver data received from Burgers' equation with varying parameters:

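import random

import numpy as np
import torch

# ArrayQueue is the buffer class added by this MR; import it from its module in Melissa.


class Buffer(ArrayQueue):
    def _init(self, maxsize):
        # One preallocated array per stored quantity, indexed by buffer slot.
        self.u_super = np.empty((maxsize, 250, 100), dtype=np.float32)
        self.x = np.empty((maxsize, 100), dtype=np.float32)
        self.variables = np.empty((maxsize, 3), dtype=np.float32)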

    def _put(self, item):
        # Pick a random free slot; tolist() hands random.choice a plain sequence.
        free_spot_index = int(random.choice(np.flatnonzero(self.set == 0).tolist()))
        u_super, x, variables = item
        self.u_super[free_spot_index] = u_super
        self.x[free_spot_index] = x
        self.variables[free_spot_index] = variables
        # Mark the slot as occupied and not yet read.
        self.seen[free_spot_index] = 0
        self.set[free_spot_index] = 1
        self.population += 1

    def _get(self):
        # Draw get_size distinct occupied slots, so a batch never repeats a sample.
        get_indices = random.sample(np.flatnonzero(self.set).tolist(), k=self.get_size)
        # Indexing with a list copies the data out of the storage arrays.
        u_super = self.u_super[get_indices]
        x = self.x[get_indices]
        variables = self.variables[get_indices]
        # Evict the served samples and mark their slots as free.
        self.u_super[get_indices] = np.empty_like(u_super)
        self.x[get_indices] = np.empty_like(x)
        self.variables[get_indices] = np.empty_like(variables)
        self.seen[get_indices] = 0
        self.set[get_indices] = 0
        self.population -= self.get_size

        u_super = torch.from_numpy(u_super)
        x = torch.from_numpy(x)
        variables = torch.from_numpy(variables)
        # Split the three parameter columns into named tensors.
        variables = dict(zip(["alpha", "beta", "gamma"], variables.T))
        return u_super, x, variables
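The slot bookkeeping used above can be exercised without Melissa or torch through a reduced sketch. TinyArrayBuffer is a hypothetical stand-in written for illustration, not the ArrayQueue from this MR; it assumes only that an occupancy mask (here also called set) decides which slots are free, and it stores two arrays of different shapes.

```python
import random

import numpy as np


class TinyArrayBuffer:
    """Hypothetical stand-in for illustration, not Melissa's ArrayQueue."""

    def __init__(self, maxsize, get_size):
        self.get_size = get_size
        self.x = np.zeros((maxsize, 2), dtype=np.float32)
        self.y = np.zeros((maxsize,), dtype=np.float32)
        self.set = np.zeros(maxsize, dtype=np.int8)  # 1 = slot holds a sample
        self.population = 0

    def put(self, item):
        free = np.flatnonzero(self.set == 0).tolist()
        if not free:
            raise RuntimeError("buffer is full")
        i = random.choice(free)  # random free slot
        self.x[i], self.y[i] = item
        self.set[i] = 1
        self.population += 1

    def get(self):
        occupied = np.flatnonzero(self.set).tolist()
        idx = random.sample(occupied, k=self.get_size)  # distinct slots
        x, y = self.x[idx], self.y[idx]  # list indexing returns copies
        self.set[idx] = 0  # evict: the slots become free again
        self.population -= self.get_size
        return x, y


buf = TinyArrayBuffer(maxsize=6, get_size=2)
for k in range(5):
    buf.put((np.full(2, k, dtype=np.float32), np.float32(k)))

x_batch, y_batch = buf.get()
print(x_batch.shape, y_batch.shape, buf.population)
```

Because the same indices select rows in every array, the stored quantities stay aligned within a batch even though their shapes differ.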

Edited Jan 06, 2023 by Lucas Meyer
