[MRG] add sparse_threshold to make_column_transformer #12152

Merged
merged 1 commit into from
Sep 25, 2018
12 changes: 11 additions & 1 deletion sklearn/compose/_column_transformer.py
@@ -669,6 +669,14 @@ def make_column_transformer(*transformers, **kwargs):
         non-specified columns will use the ``remainder`` estimator. The
         estimator must support `fit` and `transform`.

+    sparse_threshold : float, default = 0.3
+        If the transformed output consists of a mix of sparse and dense data,
+        it will be stacked as a sparse matrix if the density is lower than this
+        value. Use ``sparse_threshold=0`` to always return dense.
+        When the transformed output consists of all sparse or all dense data,
+        the stacked result will be sparse or dense, respectively, and this
+        keyword will be ignored.
+
     n_jobs : int or None, optional (default=None)
         Number of jobs to run in parallel.
         ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
@@ -705,9 +713,11 @@ def make_column_transformer(*transformers, **kwargs):
     """
     n_jobs = kwargs.pop('n_jobs', None)
     remainder = kwargs.pop('remainder', 'drop')
+    sparse_threshold = kwargs.pop('sparse_threshold', 0.3)
     if kwargs:
         raise TypeError('Unknown keyword arguments: "{}"'
                         .format(list(kwargs.keys())[0]))
     transformer_list = _get_transformer_list(transformers)
     return ColumnTransformer(transformer_list, n_jobs=n_jobs,
-                             remainder=remainder)
+                             remainder=remainder,
+                             sparse_threshold=sparse_threshold)
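
The stacking rule stated in the new docstring can be sketched in plain Python. This is only an illustration of the docstring text as written, not the library's code, and the library's actual handling may differ. `Block` and `stacks_as_sparse` are hypothetical names, and the sketch assumes that a dense block counts every element as stored while a sparse block counts only its explicitly stored entries.

```python
from collections import namedtuple

# Hypothetical stand-in for the output of one transformer.
# `stored` is the number of stored values: every element for a dense
# block (assumption), only the explicit entries for a sparse block.
Block = namedtuple("Block", ["n_rows", "n_cols", "stored", "is_sparse"])


def stacks_as_sparse(blocks, sparse_threshold=0.3):
    """Return True if the stacked output would be sparse per the docstring."""
    if all(b.is_sparse for b in blocks):
        return True   # all sparse: the threshold is ignored
    if not any(b.is_sparse for b in blocks):
        return False  # all dense: the threshold is ignored
    # Mixed case: compare the overall density with the threshold.
    stored = sum(b.stored for b in blocks)
    total = sum(b.n_rows * b.n_cols for b in blocks)
    return stored / total < sparse_threshold


# Two scaled numeric columns next to a 50-column one-hot encoding.
dense = Block(n_rows=100, n_cols=2, stored=200, is_sparse=False)
one_hot = Block(n_rows=100, n_cols=50, stored=100, is_sparse=True)

print(stacks_as_sparse([dense, one_hot]))                      # True (density 300/5200)
print(stacks_as_sparse([dense, one_hot], sparse_threshold=0))  # False
```

With the default of 0.3 the mixed output stays sparse because only about 6% of the stacked values are stored; setting the threshold to 0 forces a dense result, as the docstring describes.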
4 changes: 3 additions & 1 deletion sklearn/compose/tests/test_column_transformer.py
@@ -431,11 +431,13 @@ def test_make_column_transformer_kwargs():
     scaler = StandardScaler()
     norm = Normalizer()
     ct = make_column_transformer(('first', scaler), (['second'], norm),
-                                 n_jobs=3, remainder='drop')
+                                 n_jobs=3, remainder='drop',
+                                 sparse_threshold=0.3)
Member

Can you use a non-default value here? Otherwise the test below does not check that the value is actually passed through (0.3 is the default).

Contributor Author

I just pushed it again. Too late?

Member

Yeah, don't worry (we are in a bit of a hurry to get a release done).
Already opened a follow-up PR: #12156

Member

But thanks for the PR!
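
The reviewer's point about the test value can be illustrated with a small self-contained sketch. `Widget`, `make_widget_buggy`, and `make_widget_fixed` are hypothetical stand-ins, not scikit-learn functions; they only mimic the `kwargs.pop` pattern from the diff, and the buggy factory deliberately forgets to forward the keyword.

```python
DEFAULT_THRESHOLD = 0.3


class Widget:
    """Hypothetical stand-in for the object a factory function builds."""

    def __init__(self, sparse_threshold=DEFAULT_THRESHOLD):
        self.sparse_threshold = sparse_threshold


def _reject_unknown(kwargs):
    # Same style of check as in the diff: complain about leftover keywords.
    if kwargs:
        raise TypeError('Unknown keyword arguments: "{}"'
                        .format(list(kwargs.keys())[0]))


def make_widget_buggy(**kwargs):
    # Accepts the keyword but never forwards it to the constructor.
    kwargs.pop('sparse_threshold', DEFAULT_THRESHOLD)
    _reject_unknown(kwargs)
    return Widget()


def make_widget_fixed(**kwargs):
    sparse_threshold = kwargs.pop('sparse_threshold', DEFAULT_THRESHOLD)
    _reject_unknown(kwargs)
    return Widget(sparse_threshold=sparse_threshold)


# Passing the default value cannot tell the two factories apart ...
assert make_widget_buggy(sparse_threshold=0.3).sparse_threshold == 0.3
assert make_widget_fixed(sparse_threshold=0.3).sparse_threshold == 0.3

# ... but a non-default value exposes the missing pass-through.
assert make_widget_buggy(sparse_threshold=0.5).sparse_threshold != 0.5
assert make_widget_fixed(sparse_threshold=0.5).sparse_threshold == 0.5
```

A test that passes the default value succeeds against both factories, so it gives no evidence that the keyword reaches the constructed object.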

     assert_equal(ct.transformers, make_column_transformer(
         ('first', scaler), (['second'], norm)).transformers)
     assert_equal(ct.n_jobs, 3)
     assert_equal(ct.remainder, 'drop')
+    assert_equal(ct.sparse_threshold, 0.3)
     # invalid keyword parameters should raise an error message
     assert_raise_message(
         TypeError,