Implement 'TorchDataset' to support an example on language models (!47) · Merge requests · Magnet / DecLearn / declearn2

BIGAUD Nathan requested to merge 24-declearn-text into develop May 23, 2023

This MR was initially about implementing 'declearn-text', a selection of tools to perform some NLP using declearn.

As work progressed, much of this proved doable by merely interfacing third-party tools with the existing declearn ones. A distinct repo (which is due to become public) was created to hold a specific use-case experiment, and the only substantial modification to declearn is made through this MR: adding an interface for 'torch.utils.data.Dataset' (as previously called for in issue #21 (closed)).

As part of that effort, the `Dataset` ABC was revised to remove the previously-required dataset saving/loading methods from the API (while retaining the existing implementation for the `InMemoryDataset` realization). Unit tests were also added; they cover `InMemoryDataset` and comprise a common test suite, in addition to covering the new `TorchDataset` features.
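For readers who want a picture of the resulting design, here is a minimal sketch of the idea: an abstract base class that only requires batching, an in-memory realization that keeps its own save/load helpers, and a wrapper around a map-style dataset. The class names are borrowed from this MR, but every method name and signature below (`generate_batches`, `save_to_json`, `load_from_json`) is an assumption made for illustration and should not be read as declearn's actual API. To keep the sketch dependency-free, the wrapper accepts any object exposing `__len__` and `__getitem__`, which is the protocol map-style torch datasets follow.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Iterator, List, Sequence


class Dataset(ABC):
    """Illustrative interface: batching is required, save/load is not."""

    @abstractmethod
    def generate_batches(self, batch_size: int) -> Iterator[List[Any]]:
        """Yield successive batches of samples."""


class InMemoryDataset(Dataset):
    """In-memory realization that retains its own save/load helpers."""

    def __init__(self, samples: Sequence[Any]) -> None:
        self.samples = list(samples)

    def generate_batches(self, batch_size: int) -> Iterator[List[Any]]:
        for start in range(0, len(self.samples), batch_size):
            yield self.samples[start:start + batch_size]

    # Hypothetical helpers: kept on this class only, not on the ABC.
    def save_to_json(self, path: str) -> None:
        with open(path, "w", encoding="utf-8") as file:
            json.dump(self.samples, file)

    @classmethod
    def load_from_json(cls, path: str) -> "InMemoryDataset":
        with open(path, "r", encoding="utf-8") as file:
            return cls(json.load(file))


class TorchDataset(Dataset):
    """Wrap any map-style dataset, i.e. one with __len__ and __getitem__."""

    def __init__(self, dataset: Any) -> None:
        self.dataset = dataset

    def generate_batches(self, batch_size: int) -> Iterator[List[Any]]:
        size = len(self.dataset)
        for start in range(0, size, batch_size):
            stop = min(start + batch_size, size)
            yield [self.dataset[i] for i in range(start, stop)]
```

Because saving and loading are no longer part of the abstract interface in this sketch, the wrapper does not need to know how to serialize an arbitrary third-party dataset; only the in-memory class carries that responsibility.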

Closes #24 (closed)

Edited Jul 26, 2023 by ANDREY Paul
