[MRG+1] compute poly features directly #3239

Merged: larsmans merged 2 commits into scikit-learn:master from larsmans:faster-poly-features on Jun 5, 2014

larsmans (Member) commented Jun 3, 2014

Here's a much simpler alternative to #3194 for fixing #3191. itertools and NumPy have all the required functionality.

The great thing about this approach is that we can turn the actual polynomials (x², y²) off and get only the interaction terms (xy) by substituting combinations for combinations_with_replacement.
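
As a rough illustration of that substitution, here is a minimal sketch (not the code from this PR) that turns index tuples from itertools into exponent rows. The helper name `power_rows` and its signature are hypothetical, and it uses plain lists instead of NumPy to stay self-contained.

```python
from itertools import chain, combinations, combinations_with_replacement


def power_rows(n_features, degree, interaction_only=False, include_bias=True):
    """Return one exponent row per output term (hypothetical helper)."""
    # combinations never repeats an index, so x**2 style terms disappear.
    comb = combinations if interaction_only else combinations_with_replacement
    start = 0 if include_bias else 1
    index_tuples = chain.from_iterable(
        comb(range(n_features), d) for d in range(start, degree + 1)
    )
    rows = []
    for idx in index_tuples:
        # Count how often each feature index occurs in the tuple.
        row = [0] * n_features
        for j in idx:
            row[j] += 1
        rows.append(row)
    return rows


# Two features (x, y), degree 2.
print(power_rows(2, 2))                         # 1, x, y, x^2, xy, y^2
print(power_rows(2, 2, interaction_only=True))  # 1, x, y, xy
```

Each row lists the exponent of every input feature in one output term, so `[1, 1]` stands for xy and `[2, 0]` for x².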

ogrisel (Member) commented Jun 3, 2014

Unfortunately the travis failure highlights that itertools.combinations_with_replacement needs to be backported in sklearn.utils.fixes to support Python 2.6.
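
For reference, a backport along these lines could follow the pure-Python equivalent given in the itertools documentation. This is only a sketch: the function name `combinations_with_replacement_compat` is hypothetical, and the version that ends up in sklearn.utils.fixes may differ.

```python
def combinations_with_replacement_compat(iterable, r):
    """Pure-Python stand-in for itertools.combinations_with_replacement."""
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        # Nothing can be drawn from an empty pool.
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        # Find the rightmost index that can still be incremented.
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        # Bump it and reset everything to its right to the same value.
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)


print(list(combinations_with_replacement_compat(range(3), 2)))
```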

ogrisel (Member) commented Jun 3, 2014

> The great thing about this approach is that we can turn the actual polynomials (x², y²) off and get only the interaction terms (xy) by substituting combinations for combinations_with_replacement.

Wouldn't it be interesting to add an option in the public API to let the user choose between the 2 (only interaction terms or all degree 2 terms)?

coveralls commented

Coverage decreased (-0.04%) when pulling c77e3f6 on larsmans:faster-poly-features into c3ed6c2 on scikit-learn:master.

larsmans (Member Author) commented Jun 4, 2014

Tests pass, except the known OMP failure on 2.6. Added a second commit for interaction features only.

ogrisel (Member) commented Jun 4, 2014

Thanks! +1 for merge.

@ogrisel ogrisel changed the title [MRG] compute poly features directly [MRG+1] compute poly features directly Jun 4, 2014
ogrisel (Member) commented Jun 4, 2014

I forgot to ask, have you quickly benchmarked the empirical runtime complexity of this method vs master?

larsmans (Member Author) commented Jun 4, 2014

Current master:

>>> %timeit PolynomialFeatures._power_matrix(3, 2, False)
1000 loops, best of 3: 244 us per loop
>>> %timeit PolynomialFeatures._power_matrix(10, 3, False)
1 loops, best of 3: 847 ms per loop
>>> PolynomialFeatures._power_matrix(30, 2, False)
# Ctrl+C after waiting 1m30

This PR:

>>> %timeit PolynomialFeatures._power_matrix(3, 2, False, False)
1000 loops, best of 3: 337 us per loop
>>> %timeit PolynomialFeatures._power_matrix(10, 3, False, False)
100 loops, best of 3: 9.73 ms per loop
>>> %timeit PolynomialFeatures._power_matrix(30, 2, False, False)
100 loops, best of 3: 17.1 ms per loop
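
To put the problem sizes above in context, the number of output terms can be counted directly with binomial coefficients. The sketch below is an illustration only: `n_output_terms` is a hypothetical helper, it assumes no bias column by default, and it relies on `math.comb` from Python 3.8+, which is newer than the interpreters discussed in this thread.

```python
from math import comb


def n_output_terms(n_features, degree, interaction_only=False, include_bias=False):
    """Count the terms of degree <= `degree` (hypothetical helper)."""
    start = 0 if include_bias else 1
    total = 0
    for d in range(start, degree + 1):
        if interaction_only:
            # Choose d distinct features.
            total += comb(n_features, d)
        else:
            # Choose d features with repetition allowed.
            total += comb(n_features + d - 1, d)
    return total


for n, d in [(3, 2), (10, 3), (30, 2)]:
    print(n, d, n_output_terms(n, d))
```

Under these assumptions the three benchmark cases produce 9, 285, and 495 terms respectively, so the outputs themselves are small.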

ogrisel (Member) commented Jun 4, 2014

Thanks!

jnothman (Member) commented Jun 5, 2014

Neat! I'm not sure about the clarity of range(not include_bias, ...) but otherwise this LGTM.
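
For readers puzzled by the same expression: `range(not include_bias, ...)` works because `bool` is a subclass of `int`, so `not include_bias` evaluates to `False` (0) or `True` (1). The sketch below shows the idiom next to a more explicit spelling; the function names are hypothetical and this is not necessarily how the merged code reads.

```python
def degrees_terse(include_bias, degree):
    # `not include_bias` is False (0) or True (1), used as the range start.
    return list(range(not include_bias, degree + 1))


def degrees_explicit(include_bias, degree):
    # Same behaviour, with the starting degree spelled out.
    start = 0 if include_bias else 1
    return list(range(start, degree + 1))


print(degrees_terse(True, 2))      # degree 0 (the bias term) is included
print(degrees_explicit(False, 2))  # starts at degree 1
```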

return powers[i]
comb = (combinations if interaction_only else combinations_w_r)
combn = chain(*(comb(range(n_features), i)
                for i in range(not include_bias, degree + 1)))
Inline review comment (Member):

You can use chain.from_iterable to avoid using a *.

larsmans (Member Author) replied:

Good point.
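
The practical difference between the two spellings is when the outer iterable gets consumed. The following sketch (with a hypothetical `groups` generator that logs each degree as it is produced) shows that `chain(*...)` unpacks everything up front, while `chain.from_iterable` stays lazy.

```python
from itertools import chain, combinations


def groups(n_features, degree, log):
    """Yield one combinations iterator per degree, logging each step."""
    for d in range(degree + 1):
        log.append(d)
        yield combinations(range(n_features), d)


log_star = []
star = chain(*groups(3, 2, log_star))  # unpacking drains the generator now

log_lazy = []
lazy = chain.from_iterable(groups(3, 2, log_lazy))  # nothing consumed yet

print(log_star, log_lazy)
```

Both objects yield the same tuples once iterated; only the timing of the outer loop differs.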

larsmans added a commit that referenced this pull request Jun 5, 2014
compute poly features directly and allow interaction features only
@larsmans larsmans merged commit 7accbfa into scikit-learn:master Jun 5, 2014
coveralls commented

Coverage decreased (-0.04%) when pulling de42d69 on larsmans:faster-poly-features into 9f468f3 on scikit-learn:master.

larsmans (Member Author) commented Jun 5, 2014

Merged after taking care of @jnothman and @arjoly's concerns.

@larsmans larsmans deleted the faster-poly-features branch June 6, 2014 09:58
jnothman (Member) commented Jun 7, 2014

@hamsal reports an issue that appears to result from this merge at
#3203 (comment)

@jnothman jnothman mentioned this pull request Jun 7, 2014
17 tasks