Commit 6b4e00d

qinhanmin2014 authored and jnothman committed
MNT KBinsDiscretizer.transform should not mutate _encoder (#12514)
Fixes #12490
1 parent 6d9acd7 commit 6b4e00d
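
The commit title describes the intent: once fitted, `transform` should only read the estimator's state. A minimal sketch of what that permits, assuming a scikit-learn release that includes this fix and using made-up toy data, is sharing one fitted `KBinsDiscretizer` across worker threads. The thread pool and the batch values are illustrative assumptions, not part of the commit.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Toy data (an assumption for illustration): one feature, three equal-width bins.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
est = KBinsDiscretizer(n_bins=3, encode="onehot-dense", strategy="uniform")
est.fit(X_train)

# Each batch is transformed in its own worker thread against the same estimator.
batches = [np.array([[0.5]]), np.array([[1.5]]), np.array([[2.5]])]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(est.transform, batches))

# Every result keeps one column per bin learned during fit.
widths = [r.shape[1] for r in results]
hot_columns = [int(r.argmax()) for r in results]
```

Each call only reads the fitted state, so the workers do not need a lock around `est.transform`.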

2 files changed, 12 additions and 4 deletions

doc/whats_new/v0.20.rst

Lines changed: 8 additions & 3 deletions
@@ -134,15 +134,20 @@ Changelog
 :mod:`sklearn.preprocessing`
 ........................

+- |Fix| Fixed bug in :class:`preprocessing.OrdinalEncoder` when passing
+  manually specified categories. :issue:`12365` by `Joris Van den Bossche`_.
+
+- |Fix| Fixed bug in :class:`preprocessing.KBinsDiscretizer` where the
+  ``transform`` method mutates the ``_encoder`` attribute. The ``transform``
+  method is now thread safe. :issue:`12514` by
+  :user:`Hanmin Qin <qinhanmin2014>`.
+
 - |API| The default value of the :code:`method` argument in
   :func:`preprocessing.power_transform` will be changed from :code:`box-cox`
   to :code:`yeo-johnson` to match :class:`preprocessing.PowerTransformer`
   in version 0.23. A FutureWarning is raised when the default value is used.
   :issue:`12317` by :user:`Eric Chang <chang>`.

-- |Fix| Fixed bug in :class:`preprocessing.OrdinalEncoder` when passing
-  manually specified categories. :issue:`12365` by `Joris Van den Bossche`_.
-
 :mod:`sklearn.utils`
 ........................

sklearn/preprocessing/_discretization.py

Lines changed: 4 additions & 1 deletion
@@ -192,6 +192,9 @@ def fit(self, X, y=None):
             self._encoder = OneHotEncoder(
                 categories=[np.arange(i) for i in self.n_bins_],
                 sparse=self.encode == 'onehot')
+            # Fit the OneHotEncoder with toy datasets
+            # so that it's ready for use after the KBinsDiscretizer is fitted
+            self._encoder.fit(np.zeros((1, len(self.n_bins_)), dtype=int))

         return self

@@ -267,7 +270,7 @@ def transform(self, X):
         if self.encode == 'ordinal':
             return Xt

-        return self._encoder.fit_transform(Xt)
+        return self._encoder.transform(Xt)

     def inverse_transform(self, Xt):
         """Transforms discretized data back to original feature space.
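
The pattern behind this change can be sketched without scikit-learn. The classes below (`ToyOneHot`, `ToyDiscretizer`) and the `refit_in_transform` flag are hypothetical stand-ins written for illustration, not the library's implementation. They contrast calling `fit_transform` on the inner encoder inside `transform`, which rewrites its state on every call, with fitting it once in `fit` and calling the read-only `transform` afterwards.

```python
class ToyOneHot:
    """Hypothetical stand-in for an encoder; not scikit-learn's OneHotEncoder."""

    def __init__(self, n_categories):
        self.n_categories = n_categories
        self.fit_calls = 0  # counts how often the fitted state is rewritten

    def fit(self, codes):
        self.fit_calls += 1
        self.categories_ = list(range(self.n_categories))
        return self

    def transform(self, codes):
        # Read-only: uses categories_ but never assigns to self.
        return [[int(c == k) for k in self.categories_] for c in codes]

    def fit_transform(self, codes):
        return self.fit(codes).transform(codes)


class ToyDiscretizer:
    """Hypothetical equal-width discretizer wrapping ToyOneHot."""

    def __init__(self, n_bins, refit_in_transform):
        self.n_bins = n_bins
        self.refit_in_transform = refit_in_transform

    def fit(self, values):
        self.lo_ = min(values)
        self.width_ = (max(values) - self.lo_) / self.n_bins
        # Fit the inner encoder once, on a placeholder code, so it is ready to use.
        self._encoder = ToyOneHot(self.n_bins).fit([0])
        return self

    def transform(self, values):
        codes = [min(int((v - self.lo_) / self.width_), self.n_bins - 1)
                 for v in values]
        if self.refit_in_transform:
            # Pattern being removed: every call rewrites the encoder's state.
            return self._encoder.fit_transform(codes)
        # Pattern being introduced: the encoder is only read.
        return self._encoder.transform(codes)


mutating = ToyDiscretizer(3, refit_in_transform=True).fit([0, 1, 2, 3])
read_only = ToyDiscretizer(3, refit_in_transform=False).fit([0, 1, 2, 3])
for est in (mutating, read_only):
    est.transform([0.5])
    est.transform([2.5])

encoded = read_only.transform([0.5, 2.5])
```

After two `transform` calls, `mutating._encoder.fit_calls` is 3 while `read_only._encoder.fit_calls` stays at 1, which is the property the commit title asks for.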
