Pin pandas to <1.4.0 to avoid bug writing categorical data to CSV (!151)

E Madison Bray requested to merge embray/pin-pandas into master Feb 20, 2022

For now this only affects the summary statistics module, so we can remove this restriction if we decide to reduce support for summary statistics.

Ironically I just opened #131 (closed) about disabling/removing the summary statistics module, but in the meantime this should be fixed so the tests pass.

Ideally we would find a workaround for the actual bug, but I can hardly find any reference to it online, so I'm pretty confused. When running the tests:

self = <IntervalArray>
[(0.0, 100.0], (100.0, 179.061], (179.061, 320.63], (320.63, 574.123], (574.123, 1028.033] ... (108647...8357.129, 623772.848], (623772.848, 1116935.852], (1116935.852, 2000000.0]]
Length: 18, dtype: interval[float64, right]
value = 'nan'

    def _validate_scalar(self, value):
        if isinstance(value, Interval):
            self._check_closed_matches(value, name="value")
            left, right = value.left, value.right
            # TODO: check subdtype match like _validate_setitem_value?
        elif is_valid_na_for_dtype(value, self.left.dtype):
            # GH#18295
            left = right = value
        else:
>           raise TypeError(
                "can only insert Interval objects and NA into an IntervalArray"
            )
E           TypeError: can only insert Interval objects and NA into an IntervalArray

/home/embray/.local/opt/miniconda/envs/dnadna-cpu/lib/python3.8/site-packages/pandas/core/arrays/interval.py:1102: TypeError
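
The traceback shows the string 'nan' being inserted into an IntervalArray, which suggests the failure occurs when a categorical column with interval categories has its missing values filled in while being written to CSV. As a possible alternative to pinning, the sketch below casts categorical columns to object dtype before calling to_csv. The helper name categoricals_to_object and the sample data are hypothetical and not part of dnadna, and this has not been verified against the affected pandas release.

```python
import io

import pandas as pd


def categoricals_to_object(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of ``df`` with categorical columns cast to object dtype.

    Hypothetical helper (not part of dnadna): the intent is that to_csv
    never has to fill missing values inside an interval-backed categorical.
    """
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            # Interval categories become plain Interval objects and
            # missing entries become NaN, which to_csv renders via na_rep.
            out[col] = out[col].astype(object)
    return out


# Made-up data: pd.cut yields a categorical column with interval
# categories, and the missing position gives a missing bin.
df = pd.DataFrame({"position": [50.0, 150.0, None]})
df["bin"] = pd.cut(df["position"], bins=[0, 100, 200])

buf = io.StringIO()
categoricals_to_object(df).to_csv(buf, na_rep="nan", index=False)
csv_text = buf.getvalue()
print(csv_text)
```

The trade-off is that the written column is no longer categorical in memory, which should not matter for CSV output since the categorical dtype is not preserved in the file anyway.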
