Commit 4501959
DOC adding PDP for categorical features in highlights (#25065)
Co-authored-by: Jérémie du Boisberranger <[email protected]>
1 parent f2f3b3c commit 4501959

2 files changed: +37 -3 lines changed

doc/whats_new/v1.2.rst

Lines changed: 1 addition & 1 deletion
@@ -411,7 +411,7 @@ Changelog
 :mod:`sklearn.inspection`
 .........................
 
-- |Enhancement| Extended :func:`inspection.partial_dependence` and
+- |MajorFeature| Extended :func:`inspection.partial_dependence` and
   :class:`inspection.PartialDependenceDisplay` to handle categorical features.
   :pr:`18298` by :user:`Madhura Jayaratne <madhuracj>` and
   :user:`Guillaume Lemaitre <glemaitre>`.

examples/release_highlights/plot_release_highlights_1_2_0.py

Lines changed: 36 additions & 2 deletions
@@ -93,15 +93,49 @@
     hist_no_interact, X, y, cv=5, n_jobs=2, train_sizes=np.linspace(0.1, 1, 5)
 )
 
+# %%
+# :class:`~inspection.PartialDependenceDisplay` exposes a new parameter
+# `categorical_features` to display partial dependence for categorical features
+# using bar plots and heatmaps.
+from sklearn.datasets import fetch_openml
+
+X, y = fetch_openml(
+    "titanic", version=1, as_frame=True, return_X_y=True, parser="pandas"
+)
+X = X.select_dtypes(["number", "category"]).drop(columns=["body"])
+
+# %%
+from sklearn.preprocessing import OrdinalEncoder
+from sklearn.pipeline import make_pipeline
+
+categorical_features = ["pclass", "sex", "embarked"]
+model = make_pipeline(
+    ColumnTransformer(
+        transformers=[("cat", OrdinalEncoder(), categorical_features)],
+        remainder="passthrough",
+    ),
+    HistGradientBoostingRegressor(random_state=0),
+).fit(X, y)
+
+# %%
+from sklearn.inspection import PartialDependenceDisplay
+
+fig, ax = plt.subplots(figsize=(14, 4), constrained_layout=True)
+_ = PartialDependenceDisplay.from_estimator(
+    model,
+    X,
+    features=["age", "sex", ("pclass", "sex")],
+    categorical_features=categorical_features,
+    ax=ax,
+)
+
 # %%
 # Faster parser in :func:`~datasets.fetch_openml`
 # -----------------------------------------------
 # :func:`~datasets.fetch_openml` now supports a new `"pandas"` parser that is
 # more memory and CPU efficient. In v1.4, the default will change to
 # `parser="auto"` which will automatically use the `"pandas"` parser for dense
 # data and `"liac-arff"` for sparse data.
-from sklearn.datasets import fetch_openml
-
 X, y = fetch_openml(
     "titanic", version=1, as_frame=True, return_X_y=True, parser="pandas"
 )
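
Reviewer note: the bar plot added above for a categorical feature such as `sex` shows one averaged prediction per category. The sketch below illustrates that brute-force idea in plain Python. It is not scikit-learn's implementation, and `toy_model`, `categorical_partial_dependence`, and the four sample rows are hypothetical stand-ins for the fitted pipeline and the Titanic data.

```python
from statistics import mean


def toy_model(row):
    """Hypothetical stand-in for a fitted model: returns a score for one row."""
    score = 0.2
    if row["sex"] == "female":
        score += 0.5
    if row["pclass"] == 1:
        score += 0.2
    return score


def categorical_partial_dependence(predict, rows, feature):
    """Average prediction when `feature` is forced to each observed category."""
    categories = sorted({row[feature] for row in rows})
    result = {}
    for category in categories:
        # Override the feature in every row; other columns stay as observed.
        forced = [{**row, feature: category} for row in rows]
        result[category] = mean([predict(r) for r in forced])
    return result


# Hypothetical sample rows (assumption: not taken from the real dataset).
rows = [
    {"sex": "female", "pclass": 1},
    {"sex": "male", "pclass": 3},
    {"sex": "male", "pclass": 1},
    {"sex": "female", "pclass": 3},
]

pd_sex = categorical_partial_dependence(toy_model, rows, "sex")
pd_pclass = categorical_partial_dependence(toy_model, rows, "pclass")
```

Each dictionary maps a category to its averaged prediction, which corresponds to one bar per category. A two-feature heatmap such as `("pclass", "sex")` would repeat the same averaging over every pair of categories.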
