[refactoring] make a base class for SNPSample and make it pluggable
This has been on my TODO list for a while and is a necessary step toward further refactoring of the configuration format.
This does not go all the way to making the SNPSource completely replaceable via plugins; for the time being the "dnadna format" (hierarchy of .npz files) is still hard-coded as the only one supported during pre-processing/training
This does lay the groundwork for allowing users to easily plug in their own data source format, and it will not require many more changes. But in the interest of getting version 1.0 finished, I've left making this fully configurable as a later exercise.
One major API change to NpzSNPSource and all other SNPSource classes is that you no longer call them to retrieve a sample, as in source(0, 0); instead you use normal indexing brackets, as in source[0, 0]. I figured this was probably less confusing, and it is closer to how pytorch Datasets work.
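To illustrate the call-to-indexing change, here is a minimal sketch. It is not the actual dnadna code: SNPSourceSketch and InMemorySource are hypothetical stand-ins, and reading the two indices as (scenario, replicate) is an assumption made only for this example.

```python
from abc import ABC, abstractmethod


class SNPSourceSketch(ABC):
    """Hypothetical stand-in for a pluggable source base class."""

    @abstractmethod
    def __getitem__(self, index):
        """Return the sample stored at the given index tuple."""


class InMemorySource(SNPSourceSketch):
    """Hypothetical source backed by a dict instead of .npz files."""

    def __init__(self, samples):
        # Assumption: samples are keyed by a (scenario, replicate) tuple.
        self._samples = samples

    def __getitem__(self, index):
        # source[0, 0] passes the tuple (0, 0) as index.
        scenario, replicate = index
        return self._samples[(scenario, replicate)]


source = InMemorySource({(0, 0): "sample-0-0"})

# New style: index the source, as you would a pytorch Dataset.
sample = source[0, 0]

# Old style, source(0, 0), no longer works because __call__ is not defined.
```

Because retrieval goes through __getitem__, a subclass only needs to implement that one method to behave like a map-style dataset for lookup purposes.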
Although it's not obvious how, this is laying some groundwork for the new config formats discussed in #68 (closed).