Alibi is an open source Python library aimed at machine learning model inspection and interpretation. The initial focus of the library is on black-box, instance-based model explanations.
If you're interested in outlier detection, concept drift or adversarial instance detection, check out our sister project alibi-detect.
- Provide high-quality reference implementations of black-box ML model explanation and interpretation algorithms
- Define a consistent API for interpretable ML methods
- Support multiple use cases (e.g. tabular, text and image data classification, regression)
Alibi can be installed from PyPI:

`pip install alibi`

This will install alibi with all its dependencies:
- beautifulsoup4
- numpy
- Pillow
- pandas
- requests
- scikit-learn
- spacy
- scikit-image
- tensorflow

To run all the example notebooks, you may additionally run `pip install alibi[examples]`, which will install the following:
- seaborn
- Keras

These algorithms provide instance-specific (sometimes also called local) explanations of ML model predictions. Given a single instance and a model prediction, they aim to answer the question "Why did my model make this prediction?" The following algorithms all work with black-box models, meaning that the only requirement is access to a prediction function (which could be an API endpoint for a model in production).
The following table summarizes the capabilities of the current algorithms:
| Explainer | Model types | Classification | Categorical data | Tabular | Text | Images | Need training set |
|---|---|---|---|---|---|---|---|
| Anchors | black-box | ✔ | ✔ | ✔ | ✔ | ✔ | For Tabular |
| CEM | black-box, TF/Keras | ✔ | ✘ | ✔ | ✘ | ✔ | Optional |
| Counterfactual Instances | black-box, TF/Keras | ✔ | ✘ | ✔ | ✘ | ✔ | No |
| Prototype Counterfactuals | black-box, TF/Keras | ✔ | ✔ | ✔ | ✘ | ✔ | Optional |
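To illustrate the black-box workflow, here is a minimal sketch using the anchor explainer on tabular data. The scikit-learn model and the Iris dataset are only illustrative, and the exact constructor arguments and the fields of the returned explanation may differ between alibi versions:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

from alibi.explainers import AnchorTabular

# Train any model -- Alibi only needs access to its prediction function.
data = load_iris()
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(data.data, data.target)
predict_fn = lambda x: clf.predict_proba(x)  # the black-box prediction function

# Build the explainer around the prediction function, fit it on the training set
# (tabular anchors need it to sample perturbations) and explain a single instance.
explainer = AnchorTabular(predict_fn, feature_names=data.feature_names)
explainer.fit(data.data)
explanation = explainer.explain(data.data[0])
print(explanation)  # contains the anchor (rule), its precision and coverage
```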
- Anchor explanations (Ribeiro et al., 2018)
- Contrastive Explanation Method (CEM, Dhurandhar et al., 2018)
  - Documentation
  - Examples: MNIST, Iris dataset
- Counterfactual Explanations (extension of Wachter et al., 2017)
  - Documentation
  - Examples: MNIST
- Counterfactual Explanations Guided by Prototypes (Van Looveren et al., 2019)
These algorithms provide instance-specific scores measuring the model's confidence in making a particular prediction.
| Algorithm | Model types | Classification | Regression | Categorical data | Tabular | Text | Images | Need training set |
|---|---|---|---|---|---|---|---|---|
| Trust Scores | black-box | ✔ | ✘ | ✘ | ✔ | ✔(1) | ✔(2) | Yes |
| Linearity Measure | black-box | ✔ | ✔ | ✘ | ✔ | ✘ | ✔ | Optional |
(1) Depending on model
(2) May require dimensionality reduction
- Trust Scores (Jiang et al., 2018)
  - Documentation
  - Examples: MNIST, Iris dataset
- Linearity Measure
  - Examples: Iris dataset, fashion MNIST
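For the confidence scores, a minimal Trust Scores sketch on the Iris dataset is shown below. The scikit-learn classifier and the `classes=3` / `k=2` arguments are illustrative, and the exact signatures may vary between alibi versions:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from alibi.confidence import TrustScore

# Train a simple classifier whose predictions we want to score.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Fit the trust score model on the training data, then score the test predictions:
# higher scores indicate predictions that agree better with the training data.
ts = TrustScore()
ts.fit(X_train, y_train, classes=3)
score, closest_class = ts.score(X_test, y_pred, k=2)
print(score[:5], closest_class[:5])
```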
Anchor method applied to the InceptionV3 model trained on ImageNet:
(Image: prediction "Persian Cat" alongside its anchor explanation.)
Contrastive Explanation method applied to a CNN trained on MNIST:
(Image: original prediction 4, pertinent negative 9, pertinent positive 4.)
Trust scores applied to a softmax classifier trained on MNIST: