Merged
8 changes: 7 additions & 1 deletion doc/whats_new/v0.22.rst
@@ -45,6 +45,7 @@ Changelog
:pr:`123456` by :user:`Joe Bloggs <joeongithub>`.
where 123456 is the *pull request* number, not the issue number.


:mod:`sklearn.base`
...................

@@ -143,12 +144,17 @@ Changelog
:pr:`14114` by :user:`Guillaume Lemaitre <glemaitre>`.

:mod:`sklearn.feature_extraction`
.......................
.................................

- |Fix| Functions created by build_preprocessor and build_analyzer of
:class:`feature_extraction.text.VectorizerMixin` can now be pickled.
:pr:`14430` by :user:`Dillon Niederhut <deniederhut>`.

- |API| Deprecated unused `copy` param for
:meth:`feature_extraction.text.TfidfVectorizer.transform`; it will be
removed in v0.24. :pr:`14520` by
:user:`Guillem G. Subies <guillemgsubies>`.

:mod:`sklearn.gaussian_process`
...............................

12 changes: 12 additions & 0 deletions sklearn/feature_extraction/tests/test_text.py
@@ -509,6 +509,18 @@ def test_tfidf_vectorizer_setters():
    assert tv._tfidf.sublinear_tf


# FIXME Remove copy parameter support in 0.24
def test_tfidf_vectorizer_deprecationwarning():
    msg = ("'copy' param is unused and has been deprecated since "
           "version 0.22. Backward compatibility for 'copy' will "
           "be removed in 0.24.")
    with pytest.warns(DeprecationWarning, match=msg):
        tv = TfidfVectorizer()
        train_data = JUNK_FOOD_DOCS
        tv.fit(train_data)
        tv.transform(train_data, copy=True)
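
The test above leans on `pytest.warns(..., match=msg)`, which checks the warning text with `re.search`. As a rough illustration of what that check amounts to, here is a stdlib-only sketch. The helper names `emit_deprecation` and `collect_warnings` are hypothetical and are not part of scikit-learn or of this PR.

```python
import re
import warnings

MSG = ("'copy' param is unused and has been deprecated since "
       "version 0.22. Backward compatibility for 'copy' will "
       "be removed in 0.24.")


def emit_deprecation():
    # Hypothetical stand-in for a call such as tv.transform(data, copy=True);
    # it only emits the warning and does no vectorizing.
    warnings.warn(MSG, DeprecationWarning)


def collect_warnings(func):
    # Record every warning raised while calling func.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return caught


caught = collect_warnings(emit_deprecation)
deprecations = [w for w in caught
                if issubclass(w.category, DeprecationWarning)]
text = str(deprecations[0].message)

# With re.search, the dots in MSG act as regex wildcards, so the pattern
# matches loosely; re.escape turns it into a literal comparison.
matched_loosely = re.search(MSG, text) is not None
matched_literally = re.search(re.escape(MSG), text) is not None
```

Both checks succeed here because the message contains no regex metacharacters other than `.`; a message containing parentheses or brackets would likely need `re.escape` before being passed to `match`.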


@fails_if_pypy
def test_hashing_vectorizer():
    v = HashingVectorizer()
13 changes: 12 additions & 1 deletion sklearn/feature_extraction/text.py
@@ -1729,7 +1729,7 @@ def fit_transform(self, raw_documents, y=None):
        # we set copy to False
        return self._tfidf.transform(X, copy=False)

    def transform(self, raw_documents, copy=True):
    def transform(self, raw_documents, copy="deprecated"):
        """Transform documents to document-term matrix.

        Uses the vocabulary and document frequencies (df) learned by fit (or
@@ -1744,13 +1744,24 @@ def transform(self, raw_documents, copy=True):
            Whether to copy X and operate on the copy or perform in-place
            operations.

            .. deprecated:: 0.22
               The `copy` parameter is unused and was deprecated in version
               0.22 and will be removed in 0.24. This parameter will be
               ignored.

        Returns
[Review comment from a Member: Please add a whitespace]

        -------
        X : sparse matrix, [n_samples, n_features]
            Tf-idf-weighted document-term matrix.
        """
        check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted')

        # FIXME Remove copy parameter support in 0.24
        if copy != "deprecated":
            msg = ("'copy' param is unused and has been deprecated since "
                   "version 0.22. Backward compatibility for 'copy' will "
                   "be removed in 0.24.")
            warnings.warn(msg, DeprecationWarning)
        X = super().transform(raw_documents)
        return self._tfidf.transform(X, copy=False)
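
The change above swaps the default `copy=True` for a string sentinel, so the method can tell an omitted argument apart from an explicitly passed one. Here is a minimal self-contained sketch of that pattern. The `Vectorizer` class and `count_deprecations` helper are hypothetical stand-ins for illustration, and the placeholder return value is not what the real method produces.

```python
import warnings


class Vectorizer:
    """Hypothetical stand-in for TfidfVectorizer, for illustration only."""

    def transform(self, raw_documents, copy="deprecated"):
        # The string sentinel distinguishes "argument omitted" from any
        # explicit value, including the old default copy=True.
        if copy != "deprecated":
            warnings.warn(
                "'copy' param is unused and has been deprecated since "
                "version 0.22. Backward compatibility for 'copy' will "
                "be removed in 0.24.",
                DeprecationWarning,
            )
        # Placeholder result; the real method returns a tf-idf matrix.
        return [doc.split() for doc in raw_documents]


def count_deprecations(**kwargs):
    # Count the DeprecationWarnings emitted by a single transform call.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        Vectorizer().transform(["the pizza pizza beer"], **kwargs)
    return sum(issubclass(w.category, DeprecationWarning) for w in caught)


omitted = count_deprecations()                   # argument not passed
explicit_true = count_deprecations(copy=True)    # old default, passed explicitly
explicit_false = count_deprecations(copy=False)
```

With a plain `copy=True` default, a caller who passes `copy=True` explicitly would look the same as one who omits the argument, so no warning could be raised for them. The sentinel avoids that ambiguity.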
