LimeOut
The project aims to tackle process fairness in machine learning models. The key idea is to use an explanation method, namely LIME (as implemented here), to assess whether a given classifier is fair by measuring its reliance on salient or sensitive features. This assessment is integrated into a human-centered workflow called LimeOut, which receives as input a triple (M, D, F) consisting of a classifier M, a dataset D, and a set F of sensitive features, and outputs a classifier M_final that is less dependent on the sensitive features without compromising accuracy. To achieve both goals, LimeOut relies on feature dropout to produce a pool of classifiers that are then combined through an ensemble approach. Feature dropout receives a classifier and a feature a as input, and produces a classifier that does not take a into account.
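As a rough illustration of the feature-dropout and ensemble steps, here is a minimal sketch using scikit-learn. It is not the authors' implementation: the helper names `drop_feature` and `limeout_ensemble` are hypothetical, and the pool construction (one classifier per sensitive feature, plus one trained without all of them, combined by averaging predicted probabilities) is one plausible reading of the workflow described above.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone

def drop_feature(model, X: pd.DataFrame, y, feature: str):
    """Feature dropout (hypothetical helper): retrain a copy of `model`
    without `feature`. Returns the fitted copy and the reduced column
    list so predictions later use the same columns."""
    cols = [c for c in X.columns if c != feature]
    clf = clone(model).fit(X[cols], y)
    return clf, cols

def limeout_ensemble(model, X: pd.DataFrame, y, sensitive_features):
    """Build one dropout classifier per sensitive feature, plus one
    trained without all of them, and average their probabilities."""
    pool = [drop_feature(model, X, y, f) for f in sensitive_features]
    rest = [c for c in X.columns if c not in set(sensitive_features)]
    pool.append((clone(model).fit(X[rest], y), rest))

    def predict_proba(X_new: pd.DataFrame) -> np.ndarray:
        # Simple soft-voting ensemble over the classifier pool.
        probs = [clf.predict_proba(X_new[cols]) for clf, cols in pool]
        return np.mean(probs, axis=0)

    return predict_proba
```

In the full workflow, LIME is first used to check which sensitive features the input model actually relies on, and dropout is applied only to those.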
LimeOut is based on the work presented in the following papers: [1](https://arxiv.org/abs/2006.10531) and [2](https://hal.archives-ouvertes.fr/hal-02864059v5).
Experiments
We implemented this approach using a larger family of ML classifiers, including AdaBoost, Bagging, Random Forest, and Logistic Regression. We evaluate LimeOut's output classifiers with respect to a wide variety of fairness metrics on several datasets (e.g., Adult, German Credit Score, HMDA, Taiwanese Credit Card, LSAC).
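The papers detail the exact metrics used in the experiments; as one representative example, a demographic parity difference could be computed as in the sketch below (the function name and the binary encoding of the sensitive attribute are assumptions):

```python
import numpy as np

def demographic_parity_difference(y_pred, sensitive):
    """|P(y_pred=1 | s=1) - P(y_pred=1 | s=0)| for binary predictions
    and a binary sensitive attribute; values closer to 0 are fairer."""
    y_pred, sensitive = np.asarray(y_pred), np.asarray(sensitive)
    rate_1 = y_pred[sensitive == 1].mean()  # positive rate, group s=1
    rate_0 = y_pred[sensitive == 0].mean()  # positive rate, group s=0
    return abs(rate_1 - rate_0)
```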
Dependencies
- Python >= 3.7
- scikit-learn >= 0.23.1