4 Methods Overview
This chapter provides an overview of interpretability approaches. The goal is to give you a map so that
when you dive into the individual models and methods, you can see the forest for the trees. Figure 4.1
provides a taxonomy of the different approaches.
Figure 4.1: Short taxonomy of interpretability methods which reflects the structure of the book.
In general, we can distinguish between interpretability by design and post-hoc interpretability.
Interpretability by design means that we train inherently interpretable models, such as using logistic
regression instead of a random forest. Post-hoc interpretability means that we use an interpretability
method after the model is trained. Post-hoc interpretation methods can be model-agnostic, such as
permutation feature importance, or model-specific, such as analyzing the features learned by a neural
network. Model-agnostic methods can be further divided into local methods, which focus on explaining
individual predictions, and global methods, which describe average model behavior across a dataset. This book
focuses on post-hoc model-agnostic methods but also covers basic models that are interpretable by design and
model-specific methods for neural networks.
Let’s look at each category of interpretability and also discuss strengths and weaknesses as they relate
to your interpretation goals.
Interpretable models by design
Interpretability by design is decided at the level of the machine learning algorithm. If you want a
machine learning algorithm that produces interpretable models, the algorithm has to constrain the
search of models to those that are interpretable. The simplest example is linear regression: When you
use ordinary least squares to fit a linear regression model, you are using an algorithm that will only
find models that are linear in the input features. Models that are interpretable by design are
also called intrinsically or inherently interpretable models, see Figure 4.2.
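To make this concrete, here is a minimal sketch (Python with scikit-learn on synthetic data; the feature names and effect sizes are made up for illustration) of how ordinary least squares restricts the search to linear models, whose coefficients can then be read off directly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: 200 samples, 3 features with known linear effects plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Ordinary least squares only searches over models that are linear in the features
model = LinearRegression().fit(X, y)

# The fitted model is fully described by its intercept and coefficients;
# each coefficient is the change in prediction per unit change in that feature
print("intercept:", round(model.intercept_, 2))
for name, coef in zip(["x1", "x2", "x3"], model.coef_):
    print(name, round(coef, 2))
```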
Figure 4.2: Interpretability by design means using machine learning algorithms that produce “inherently
interpretable” models.
This book covers the most basic interpretability by design approaches:
Linear regression: Fit a linear model by minimizing the sum of squared errors.
Logistic regression: Extend linear regression for classification using a nonlinear transformation.
Linear model extensions: Add penalties, interactions, and nonlinear terms for more flexibility.
Decision trees: Recursively split data to create tree-based models (a small sketch follows after this list).
Decision rules: Extract if-then rules from data.
RuleFit: Combine tree-based rules with Lasso regression to learn sparse rule-based models.
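As a small sketch of the decision tree idea mentioned above (using scikit-learn and the Iris data purely for illustration), a depth-limited tree is small enough to print in full as nested if-then conditions:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# A shallow tree stays small enough to inspect in its entirety
data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

# Print the learned splits as nested if-then conditions
print(export_text(tree, feature_names=list(data.feature_names)))
```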
There are many more approaches to interpretable models, ranging from extensions of these basic
approaches to very specialized approaches. Including all of them would be impossible, so I have
focused on the basic ones. Here are some examples of other interpretable-by-design approaches:
Prototype-based neural networks for image classification, called ProtoViT (Ma et al. 2024). These
neural networks are trained so that the image classification is a weighted sum of prototypes
(special images from the training data) and sub-prototypes.
Yang et al. (2024) proposed inherently interpretable tree ensembles which are boosted trees (e.g.,
with XGBoost) with adjusted hyperparameters, such as low maximum tree depth, a different
representation where feature effects are sorted into main effects and interactions, and pruning of
effects. This approach mixes both interpretability by design and post-hoc interpretability.
Model-based boosting is an additive modeling framework. The trained model is a weighted sum of
linear effects, splines, tree stumps, and other so-called weak learners (Bühlmann and Hothorn
2007).
Generalized additive models with automatic interaction detection (Caruana et al. 2015).
But how interpretable are intrinsically interpretable models? Approaches to interpretable models differ
wildly, and so do their interpretations. Let’s talk about the scope of interpretability, which helps us sort
the approaches:
The model is entirely interpretable. Example: a small decision tree can be visualized and
understood easily. Or a linear regression model with not too many coefficients. “Entirely
interpretable” is a tough requirement, and again a bit fuzzy at the same time. My stance is that the
term entirely interpretable may only be used for the simplest of models, such as very sparse
linear regression or very short trees, if at all.
Parts of the model are interpretable. While a regression model with hundreds of features may
not be “entirely interpretable”, we can still interpret the individual coefficients associated with the
features. Or if you have a huge decision list, you can still inspect individual rules.
The model predictions are interpretable. Some approaches allow us to interpret individual
predictions. Suppose you developed a k-nearest-neighbor-like machine learning algorithm for
images: to classify an image, take the k most similar training images and return their most common
class. A prediction is then fully explained by showing the k similar images (a minimal sketch of this
idea follows after this list). Or for decision trees, a prediction is explained by returning the decision
path that led to it.
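Here is a minimal, hypothetical sketch of such a nearest-neighbor image classifier (random arrays stand in for real images); the explanation of a prediction is simply the set of the k most similar training images:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for 100 grayscale 28x28 training images with binary labels
train_images = rng.random((100, 28, 28))
train_labels = rng.integers(0, 2, size=100)
query_image = rng.random((28, 28))

k = 5
# Similarity measured as Euclidean distance on flattened pixels
distances = np.linalg.norm(
    train_images.reshape(len(train_images), -1) - query_image.reshape(1, -1), axis=1
)
neighbors = np.argsort(distances)[:k]

# The prediction is the majority class among the k nearest training images ...
prediction = np.bincount(train_labels[neighbors]).argmax()
# ... and those k images themselves are the explanation
print("prediction:", prediction, "| explained by training images:", neighbors)
```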
Assess interpretability scope of methods
When exploring a new interpretability approach, assess the scope of interpretability. Ask at which levels
(entirely interpretable, partially interpretable, or interpretable predictions) the approach operates.
Models that are interpretable by design are usually easier to debug and improve because we get
insights into their inner workings.
Interpretability by design also shines when it comes to justifying models and outputs, as they often
faithfully explain how predictions were made. They also tend to make it easier to check with domain
experts that the models are consistent with domain knowledge. Many data-driven fields already have
established (interpretable) modeling approaches, such as logistic regression in medical research.
When it comes to discovering insights, interpretable models are a mixed bag. They make it easy to
extract insights about the models themselves. But it gets trickier when it comes to data insights
because of the need for a theoretical link between model structure and data. To interpret the model in
place of the data, you have to assume that the model structure reflects the world – something
statisticians work very hard on and need a lot of assumptions for. But what if there is a model with
better predictive performance? You would have to argue why the interpretable model represents the
data correctly, even though its predictive performance is inferior. In addition, there are often multiple
models with similar performance but different interpretations, which makes our job more difficult. This
is called the Rashomon effect. The problem with this model multiplicity is that it makes it very unclear
which model to interpret.
Rashomon
The Japanese movie Rashomon from 1950 tells four different versions of a murder story. While each version
can explain the events equally well, they are incompatible with each other. This phenomenon was named the
Rashomon effect.
Post-hoc interpretability
Post-hoc methods are applied after the model has been trained. These methods can be either model-
agnostic or model-specific:
Model-agnostic: We ignore what’s inside the model and only analyze how the model output
changes with respect to changes in the feature inputs. For example, permuting a feature and
measuring how much the model error increases.
Model-specific: We analyze parts of the model to better understand it. This can mean analyzing which
types of images a neuron in a neural network responds to most strongly, or computing the Gini
importance in random forests.
Model-agnostic post-hoc methods
Model-agnostic methods work by the SIPA principle: sample from the data, perform an intervention
on the data, get the predictions for the manipulated data, and aggregate the results (Scholbeck et al.
2020). An example is permutation feature importance: We take a data sample, intervene by permuting
a feature, get the model predictions, and compare the resulting model error to the original error
(aggregation). What makes these methods model-agnostic is that they don’t need to “look
inside” the model, like reading out coefficients or weights, as visualized in Figure 4.3.
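As a concrete illustration of the SIPA steps, here is a minimal sketch of permutation feature importance on synthetic data (scikit-learn also provides a ready-made permutation_importance function; this version just spells out the steps, and in practice you would compute it on held-out data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Sample: synthetic data where only the first two features matter
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=500)
model = RandomForestRegressor(random_state=0).fit(X, y)

baseline = mean_squared_error(y, model.predict(X))  # original loss
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])          # intervention: permute feature j
    loss = mean_squared_error(y, model.predict(X_perm))   # prediction on manipulated data
    print(f"feature {j}: importance = {loss - baseline:.3f}")  # aggregation: loss increase
```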
Figure 4.3: Model-agnostic interpretation methods work with inputs and outputs and ignore model internals.
Model-agnostic interpretation separates the model interpretation from the model training. Looking at
this from a higher level, the modeling process gains another layer: It starts with the world, which we
capture in the form of data, from which we learn a model. On top of that model, we have
interpretability methods for humans to consume. See Figure 4.4. For model-agnostic methods, we
have this separation, while for interpretability by design, we have model and interpretability layers
merged into one.
Figure 4.4: The big picture of (model-agnostic) interpretable machine learning. The real world goes through many
layers before it reaches the human in the form of explanations.
Separating the explanations from the machine learning model (= model-agnostic interpretation
methods) has some advantages (Ribeiro, Singh, and Guestrin 2016). The biggest strength is flexibility
in both the choice of model and the choice of interpretation method. For example, if you’re visualizing
feature effects of an XGBoost model with the partial dependence plot (PDP), you can even change the
underlying model and still use the same type of interpretation. Or, if you no longer like the PDP, you can
use accumulated local effects (ALE) without having to change the underlying XGBoost model. But if
you are using a linear regression model and interpreting its coefficients, switching to a rule-based
classifier will also change the means of interpretation. Some model-agnostic methods even give you
flexibility in the feature representation used to create the explanations: For example, you can create
explanations based on image patches instead of pixels when explaining image classifier outputs.
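A minimal sketch of this flexibility with scikit-learn’s partial dependence tooling (I use scikit-learn estimators instead of XGBoost here to keep the example self-contained): the interpretation call stays identical when the underlying model is swapped.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=300, n_features=5, random_state=0)

# The same PDP call works for any fitted estimator; only the model changes
for Model in (GradientBoostingRegressor, RandomForestRegressor):
    model = Model(random_state=0).fit(X, y)
    PartialDependenceDisplay.from_estimator(model, X, features=[0, 1])
plt.show()
```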
Model-agnostic interpretation methods can be further divided into local and global methods. Local
methods aim to explain individual predictions, while global methods describe how features affect
predictions on average.
Local model-agnostic post-hoc methods
Local interpretation methods explain individual predictions. Approaches in this category are quite
diverse:
Ceteris paribus plots show how changing a feature changes a prediction.
Individual conditional expectation curves show how changing one feature changes the prediction
of multiple data points.
Local surrogate models (LIME) explain a prediction by replacing the complex model with a locally
interpretable model.
Scoped rules (anchors) are rules that describe which feature values “anchor” a prediction,
meaning that no matter how many of the other features you change, the prediction remains fixed.
Counterfactual explanations explain a prediction by examining which features would need to be
changed to achieve a desired prediction.
Shapley values fairly assign the prediction to individual features.
SHAP is a computation method for Shapley values but also suggests global interpretation methods
based on combinations of Shapley values across the data.
LIME and Shapley values (and SHAP) are attribution methods that explain a data point’s prediction as
the sum of feature effects. Other methods, such as ceteris paribus and ICE, focus on individual
features and how sensitive the prediction function is to those features. Methods such as counterfactual
explanations and anchors fall somewhere in the middle, relying on a subset of the features to explain a
prediction.
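To illustrate the simplest of these, here is a minimal ceteris paribus style sketch (synthetic data and model, chosen only for illustration): vary one feature over a grid while holding the other feature values of a data point fixed, and record how the prediction changes. Doing this for many data points gives ICE curves.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

x = X[0].copy()                                       # the data point to explain
grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 20)

# Ceteris paribus: change only feature 0, keep everything else fixed
curve = []
for value in grid:
    x_mod = x.copy()
    x_mod[0] = value
    curve.append(model.predict(x_mod.reshape(1, -1))[0])
print(np.round(curve, 2))  # plotting grid against curve gives the curve for this data point
```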
For model debugging, local methods provide a “zoomed in” view that can be useful for understanding
edge cases or studying unusual predictions. For example, you can look at explanations for the
prediction with the worst prediction error and see if it’s just a difficult data point to predict, or if maybe
your model isn’t good enough, or the data point is mislabeled. Beyond that, it’s the global model-
agnostic methods that are more useful for model improvements.
When it comes to using local interpretation methods to justify individual predictions, the usefulness is
mixed: Methods such as ceteris paribus and counterfactual explanations can be very useful for
justifying model predictions because they faithfully reflect the raw model predictions. Attribution
methods like SHAP or LIME are themselves a kind of “model” (or at least more complex estimates) on
top of the model being explained and therefore may not be as suitable for high-stakes justification
purposes (Rudin 2019).
Local methods can be useful for data insights. Attribution methods such as Shapley values work with a
reference dataset and therefore allow comparing the current prediction with different subsets of the data,
which lets you ask different questions. In general, the usefulness of model-agnostic interpretation,
for both local and global methods, depends on model performance. Ceteris paribus plots and ICE are
also useful for model insights.
Global model-agnostic post-hoc methods
Global methods describe the average behavior of a machine learning model across a dataset. In this
book, you will learn about the following model-agnostic global interpretation techniques:
The partial dependence plot is a feature effect method.
Accumulated local effect plots also visualize feature effects and are designed to work with correlated features.
Feature interaction (H-statistic) quantifies the extent to which the prediction is the result of joint
effects of the features.
Functional decomposition is a central idea of interpretability and a technique for decomposing
prediction functions into smaller parts.
Permutation feature importance measures the importance of a feature as the increase in loss when
the feature is permuted.
Leave one feature out (LOFO) removes a feature and measures the increase in loss after retraining
the model without that feature.
Surrogate models replace the original model with a simpler model for interpretation.
Prototypes and criticisms are representative data points of a distribution and can be used to
improve interpretability.
Two broad categories within global model-agnostic methods are feature effects and feature
importance. Feature effects (PDP, ALE, H-statistic, decomposition) are about showing the relationship
between inputs and outputs. Feature importance (PFI, LOFO, SHAP importance, …) is about ranking the
features by importance, where importance is defined differently by each of the methods.
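As a sketch of the LOFO idea listed above (remove a feature, retrain, and measure the increase in test loss), assuming synthetic data and a random forest:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=500)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def test_error(X_tr, X_te):
    model = RandomForestRegressor(random_state=0).fit(X_tr, y_train)
    return mean_squared_error(y_test, model.predict(X_te))

baseline = test_error(X_train, X_test)
for j in range(X.shape[1]):
    keep = [i for i in range(X.shape[1]) if i != j]        # drop feature j
    loss = test_error(X_train[:, keep], X_test[:, keep])   # retrain without it
    print(f"feature {j}: LOFO importance = {loss - baseline:.3f}")
```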
Since global interpretation methods describe average behavior, they are especially useful when the
modeler wants to debug a model. LOFO, in particular, is related to feature selection methods and is
particularly useful for model improvement.
To justify the models to stakeholders, global interpretation methods can provide some broad strokes
such as which features were relevant. You can also use global methods in combination with inherently
interpretable models. For example, while decision rule lists make it easy to justify individual
predictions, you may also want to justify the model itself by showing which features were important
overall.
Global methods are often expressed as expected values based on the distribution of the data. For
example, the partial dependence plot, a feature effect plot, is the expected prediction when all other
features are marginalized out. This is what makes these methods so useful for understanding the
general mechanisms in the data. My colleagues and I wrote papers about the PDP and PFI, and how
they can be used to infer properties about the data (Molnar et al. 2023; Freiesleben et al. 2024).
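For reference, this is the usual way to write the partial dependence of a prediction function $\hat{f}$ on a feature subset $S$ (with $C$ denoting the remaining features), together with its estimate as an average over the data:

$$
\mathrm{PD}_S(x_S) \;=\; \mathbb{E}_{X_C}\!\left[\hat{f}(x_S, X_C)\right] \;\approx\; \frac{1}{n} \sum_{i=1}^{n} \hat{f}\!\left(x_S, x_C^{(i)}\right)
$$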
Turn global into group-wise
By applying global methods to subsets of your data, you can turn global methods into “group-wise” or
“regional” methods. We will see this in action in the examples in this book.
Model-specific post-hoc methods
As the name implies, post-hoc model-specific methods are applied after model training but only work
for specific machine learning models, as visualized in Figure 4.5. There are many such examples,
ranging from Gini importance for random forests to computing odds ratios for logistic regression. This
book focuses on post-hoc interpretation methods for neural networks.
Figure 4.5: Model-specific methods make complex models more interpretable by analyzing the models.
To make predictions with a neural network, the input data is passed through many layers of
multiplication with the learned weights and through non-linear transformations. A single prediction can
involve millions of multiplications, depending on the architecture of the neural network. There’s no
chance that we humans can follow the exact mapping from data input to prediction. We would have to
consider millions of weights interacting in complex ways to understand a neural network’s prediction.
To interpret the behavior and predictions of neural networks, we need specific interpretation methods.
Neural networks are an interesting target for interpretation because neural networks learn features
and concepts in their hidden layers. Also, we can leverage their gradients for computationally efficient
methods.
The neural network part covers the following techniques that answer different questions:
Learned Features: What features did the neural network learn?
Saliency Maps: How did each pixel contribute to a particular prediction? (A minimal gradient-based sketch follows after this list.)
Concepts: Which concepts did the neural network learn?
Adversarial Examples: How can we fool the neural network?
Influential Instances: How influential was a training data point for a given prediction?
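To make the saliency map idea concrete, here is a minimal vanilla-gradient sketch in PyTorch (the tiny untrained network and the random image are placeholders; in practice you would use a trained classifier and a real image). The per-pixel importance is the magnitude of the gradient of the predicted class score with respect to the input pixels.

```python
import torch
import torch.nn as nn

# Placeholder model and input: a tiny untrained CNN and a random 32x32 RGB "image"
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
image = torch.rand(1, 3, 32, 32, requires_grad=True)

# Vanilla gradient saliency: backpropagate the top class score to the input pixels
scores = model(image)
scores[0, scores.argmax()].backward()
saliency = image.grad.abs().max(dim=1).values[0]  # (32, 32) map of per-pixel importance
print(saliency.shape)
```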
In general, the biggest strength of model-specific methods is the ability to learn about the models
themselves. This can also help improve the model and justify it to others. When it comes to data
insights, model-specific methods have similar problems as intrinsically interpretable models: They
need a theoretical justification for why the model interpretation reflects the data.
The lines are blurred
I’ve presented neat, distinct categories. But in reality, the lines between interpretability by design and
post-hoc interpretability are blurry. Just a few examples:
Is logistic regression an intrinsically interpretable model? You have to post-process the
coefficients to interpret them as odds ratios. And if you want to interpret the model effects at the level
of probabilities, you have to compute marginal effects, which can definitely be seen as a post-hoc
interpretation method (one that can also be applied to other models).
Boosted tree ensembles are not considered to be interpretable. But if you set the maximum tree
depth to 1, you get boosted tree stumps, which gives you something like a generalized additive
model.
To explain a linear regression prediction, you can multiply each feature value by its coefficient;
these products are called effects. If you additionally subtract from each effect the average effect
in the data, you have computed Shapley values, which are typically considered to be model-agnostic
(a short formula follows below).
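In formula form (a standard result for linear models, with $\beta_j$ the coefficient of feature $j$ and $\mathbb{E}[X_j]$ its average in the data), the Shapley value of feature $j$ for a data point $x$ is:

$$
\phi_j(x) \;=\; \beta_j x_j \;-\; \beta_j\,\mathbb{E}[X_j]
$$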
The moral of the story: Interpretability is a fuzzy concept. Embrace that fuzziness, don’t get too
attached to one approach, and feel free to mix and match approaches.