[MRG+1] add a transform_max_iter to SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning #12682
Conversation
Can/should we test this?
The test is basically a copy/paste of the example which was giving the warning.
I'd also need to check the docs for this, but I'd appreciate some feedback before I do that.
doc/whats_new/v0.21.rst (Outdated)

    :mod:`sklearn.decomposition`
    ............................

    - |Fix| :class:`decomposition.SparseCoder` now passes `max_iter` to the
This is now out of date. Please describe here what this change does, or at least update the PR description, as now it seems unclear with the addition of method_max_iter and transform_max_iter.

This PR now touches quite a few public functions/classes. I've updated the whats_new entries reflecting these changes, but I have two questions:
add a transform_max_iter to SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning
I don't think there's a problem with separate what's new sections, and yes, if behaviour changes without user intervention, changed models is relevant.

I've not yet understood this fully, but I am not very familiar with the algorithms. Why is …

I mostly followed whatever naming convention was there in each method/class. Some call the algorithm a method, and prefix the parameters related to it with …
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As regards the other tests, perhaps checking that changing parameter value can affect the result is an okay level of testing.
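As a sketch of that testing level, one could check that varying the iteration cap observably changes the output. The `toy_solver` below is an illustrative stand-in for an iterative encoder, not scikit-learn's `sparse_encode`:

```python
# Minimal illustration of the "changing a parameter changes the result"
# testing style. toy_solver is a made-up stand-in, not scikit-learn API.

def toy_solver(target, max_iter):
    """Crude fixed-step iteration toward `target`; more iterations
    get closer, so max_iter observably affects the output."""
    x = 0.0
    for _ in range(max_iter):
        x += 0.1 * (target - x)
    return x

def test_max_iter_affects_result():
    low = toy_solver(1.0, max_iter=2)
    high = toy_solver(1.0, max_iter=200)
    assert low != high              # the parameter changed the output
    assert abs(high - 1.0) < 1e-6   # and more iterations converge further

test_max_iter_affects_result()
```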
    def ricker_function(resolution, center, width):
        """Discrete sub-sampled Ricker (Mexican hat) wavelet"""
        x = np.linspace(0, resolution - 1, resolution)
        x = ((2 / ((np.sqrt(3 * width) * np.pi ** 1 / 4)))
I think you must need parentheses around the 1/4... ** takes precedence. Perhaps use 0.25 instead of 1/4. You can remove some parentheses by changing * to /.
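To spell out the precedence point: `**` binds tighter than `/`, so `np.pi ** 1 / 4` is parsed as `(np.pi ** 1) / 4` rather than the intended fourth root. A quick check with the standard library:

```python
# ** binds tighter than /, so `pi ** 1 / 4` is NOT a fourth root.
import math

a = math.pi ** 1 / 4    # parsed as (pi ** 1) / 4
b = math.pi ** (1 / 4)  # the intended fourth root
c = math.pi ** 0.25     # clearer spelling of the same thing

assert a == math.pi / 4
assert b == c
assert a != b
```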
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't write these. They're copy-pasted from the failing example (plot_sparse_coding.py). But I'll try to simplify both then.
        x = np.linspace(0, resolution - 1, resolution)
        x = ((2 / ((np.sqrt(3 * width) * np.pi ** 1 / 4)))
             * (1 - ((x - center) ** 2 / width ** 2))
             * np.exp((-(x - center) ** 2) / (2 * width ** 2)))
You have extraneous brackets here making things hard to read. ** takes precedence over *
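For illustration, a de-parenthesized form of the expression (using the `0.25` spelling suggested above, so both sides are the corrected formula) can be checked for equivalence; `math` stands in for `numpy` so the snippet is self-contained:

```python
# Since ** binds tighter than *, /, and unary minus, several of the
# parentheses above are redundant. Verify a simplified form is equivalent.
import math

def dense(x, center, width):
    # bracket-heavy form, with the corrected ** 0.25
    return ((2 / ((math.sqrt(3 * width) * math.pi ** 0.25)))
            * (1 - ((x - center) ** 2 / width ** 2))
            * math.exp((-(x - center) ** 2) / (2 * width ** 2)))

def simplified(x, center, width):
    # same computation with the redundant brackets removed
    return (2 / (math.sqrt(3 * width) * math.pi ** 0.25)
            * (1 - (x - center) ** 2 / width ** 2)
            * math.exp(-(x - center) ** 2 / (2 * width ** 2)))

for x in (0.0, 1.5, 7.0):
    assert math.isclose(dense(x, center=5.0, width=2.0),
                        simplified(x, center=5.0, width=2.0))
```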
I'm not sure where the issue is coming from, but now the …
Tests are green again :)
@jnothman happy with this now maybe?
@jnothman should we put this in for 0.21?
The simplest test for MiniBatchDictionaryLearning and DictionaryLearning would be to mock out sparse_encode, but I think it is not needed.
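A sketch of the mocking approach mentioned here, using a stand-in namespace rather than scikit-learn's actual module (`encoder`, and the toy `dict_learning`, are hypothetical):

```python
# Sketch: patch the inner encoder and assert the outer function
# forwards its iteration cap. Not scikit-learn's real code.
from unittest import mock
import types

# a stand-in "module" holding the inner encoder
encoder = types.SimpleNamespace()
encoder.sparse_encode = lambda X, D, max_iter=1000: "code"

def dict_learning(X, D, method_max_iter=500):
    # behaviour under test: method_max_iter must reach the inner encoder
    return encoder.sparse_encode(X, D, max_iter=method_max_iter)

with mock.patch.object(encoder, "sparse_encode", return_value="mocked") as enc:
    result = dict_learning("X", "D", method_max_iter=42)
    enc.assert_called_once_with("X", "D", max_iter=42)
    assert result == "mocked"
```

After the `with` block the original encoder is restored, so the patch leaves no trace on later tests.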
    @@ -56,6 +56,54 @@ def test_dict_learning_overcomplete():
         assert dico.components_.shape == (n_components, n_features)

    +def test_max_iter():
Testing for convergence this way is great!
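The shape of such a convergence test can be sketched with a toy solver: assert that a warning fires with a small cap and is absent with a large one. The `ConvergenceWarning` and `solve` below are illustrative, not scikit-learn's:

```python
# Toy convergence test: warn when iterations run out, and assert the
# warning's presence/absence depending on max_iter.
import warnings

class ConvergenceWarning(UserWarning):
    pass

def solve(max_iter):
    x, target = 0.0, 1.0
    for _ in range(max_iter):
        x += 0.5 * (target - x)
        if abs(target - x) < 1e-8:
            return x
    warnings.warn("did not converge", ConvergenceWarning)
    return x

def test_max_iter():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        solve(max_iter=2)
    assert any(issubclass(w.category, ConvergenceWarning) for w in caught)

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        solve(max_iter=100)
    assert not caught

test_max_iter()
```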
    @@ -832,7 +848,8 @@ def dict_learning_online(X, n_components=2, alpha=1, n_iter=100,
             print('|', end=' ')
         code = sparse_encode(X, dictionary.T, algorithm=method, alpha=alpha,
                              n_jobs=n_jobs, check_input=False,
    -                         positive=positive_code)
    +                         positive=positive_code, max_iter=method_max_iter,
    +                         verbose=verbose)
We may have avoided this because sparse_encode accepts an int for verbose, and verbose in this context is a bool. (Although this will still work.)
We could pass verbose > 0, but I guess this way it's fine.
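The reason passing a bool "still works" is that Python's `bool` is a subclass of `int`, so `True` behaves as `1` and `False` as `0`:

```python
# bool is a subclass of int, so a bool is accepted anywhere an int is.
assert isinstance(True, int)
assert True == 1 and False == 0
assert True + True == 2

# the suggested alternative: normalise an int verbosity to a real bool
verbose = 3
assert (verbose > 0) is True
```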
    @@ -569,7 +576,8 @@ def dict_learning(X, n_components, alpha, max_iter=100, tol=1e-8,

             # Update code
             code = sparse_encode(X, dictionary, algorithm=method, alpha=alpha,
    -                             init=code, n_jobs=n_jobs, positive=positive_code)
    +                             init=code, n_jobs=n_jobs, positive=positive_code,
    +                             max_iter=method_max_iter, verbose=verbose)
We may have avoided this because sparse_encode accepts an int for verbose, and verbose in this context is a bool. (Although this will still work.)
@thomasjpfan I'm not sure if in your reviews you want me to change anything :P
    @@ -493,6 +494,12 @@ def dict_learning(X, n_components, alpha, max_iter=100, tol=1e-8,

         .. versionadded:: 0.20

    +    method_max_iter : int, optional (default=1000)
    +        It is passed to the underlying ``method`` as their ``max_iter``
"Maximum number of iterations to perform it..."
You mean to add this at the beginning of the sentence? It would become "Maximum number of iterations to perform it is passed to the underlying ``method`` as their ``max_iter`` parameter." I can't parse the sentence, confused!
Since this is passed to sparse_encode, I was thinking of using the docstring from there here:

    method_max_iter : int, 1000 by default
        Maximum number of iterations to perform if `algorithm='lasso_cd'`.

This way a user would not need to figure it out by parsing the code to get to sparse_encode and see how max_iter is used.
I think this looks better now.
    @@ -713,8 +714,7 @@ def dict_learning_online(X, n_components=2, alpha=1, n_iter=100,
         .. versionadded:: 0.20

         method_max_iter : int, optional (default=1000)
    -        It is passed to the underlying ``method`` as their ``max_iter``
    -        parameter.
    +        Maximum number of iterations to perform in each ``sparse_encode`` step.
To be consistent with dict_learning:

    Maximum number of iterations to perform.
The reason for the extra part at the end is that dict_learning_online has an n_iter parameter as well, which has "Maximum number of iterations to perform." as its description. I needed to distinguish the two somehow.
With two iters, we may need to describe what they actually do. Since dict_learning_online only supports lasso, maybe:

    n_iter :
        Number of mini-batch iterations to perform.
    method_max_iter :
        Maximum number of iterations to perform when solving the lasso problem.
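The distinction between the two counts can be illustrated with a toy nesting of the loops: an outer mini-batch loop bounded by `n_iter`, and an inner solver capped by `method_max_iter`. The names mirror the discussion, but the code is a stand-in, not scikit-learn's implementation:

```python
# Toy illustration of the two iteration counts: n_iter bounds the outer
# mini-batch loop; method_max_iter caps each inner (lasso-like) solve.

def inner_solver(value, method_max_iter):
    """Stand-in for the lasso solve; returns how many steps it ran."""
    steps = 0
    for _ in range(method_max_iter):
        steps += 1
        value *= 0.5
        if value < 1e-3:   # converged early
            break
    return steps

def online_learning(n_iter, method_max_iter):
    total_inner = 0
    for _ in range(n_iter):                       # mini-batch iterations
        total_inner += inner_solver(1.0, method_max_iter)
    return total_inner

# each inner solve needs 10 halvings to reach < 1e-3 (2**-10 ≈ 9.8e-4)
assert online_learning(n_iter=3, method_max_iter=100) == 30
assert online_learning(n_iter=3, method_max_iter=4) == 12  # capped at 4
```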
Oops, I suppose I also need to change the versions in docstrings and move the whats_new entries.
ping @thomasjpfan :)
Thank you @adrinjalali !
…, and `MiniBatchDictionaryLearning` (scikit-learn#12682)

Fixes #12650

`SparseCoder` now passes `max_iter` to the underlying `LassoLars` when `algorithm='lasso_lars'`.