
[MRG+1] add a transform_max_iter to SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning #12682


Merged
merged 29 commits into scikit-learn:master on Jul 8, 2019

Conversation

adrinjalali
Member

@adrinjalali adrinjalali commented Nov 27, 2018

Fixes #12650

SparseCoder now passes max_iter to the underlying LassoLars when algorithm='lasso_lars'
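For context, the merged API can be exercised like this (a sketch against current scikit-learn; the random data, dictionary size, and alpha are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.RandomState(0)
dictionary = rng.randn(10, 15)  # 10 atoms, 15 features
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
X = rng.randn(5, 15)

# transform_max_iter is forwarded to LassoLars as its max_iter
coder = SparseCoder(dictionary, transform_algorithm='lasso_lars',
                    transform_alpha=0.1, transform_max_iter=2000)
code = coder.transform(X)
print(code.shape)  # (5, 10): one row of coefficients per sample
```

Raising transform_max_iter is what silences the ConvergenceWarning this PR set out to fix.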

@adrinjalali adrinjalali changed the title SparseCoder passes max_iter to LassoLarse [MRG+1] SparseCoder passes max_iter to LassoLarse Nov 28, 2018
Member

@jnothman jnothman left a comment


Can/should we test this?

@adrinjalali adrinjalali changed the title [MRG+1] SparseCoder passes max_iter to LassoLarse [WIP] SparseCoder passes max_iter to LassoLarse Nov 28, 2018
@adrinjalali
Member Author

The test is basically a copy/paste of the example which was giving the warning.

@adrinjalali adrinjalali changed the title [WIP] SparseCoder passes max_iter to LassoLarse [MRG] SparseCoder passes max_iter to LassoLarse Nov 28, 2018
@adrinjalali
Member Author

I'd also need to check the docs for this, but I'd appreciate some feedback before I do that.

@adrinjalali adrinjalali changed the title [MRG] SparseCoder passes max_iter to LassoLarse [MRG] ENH SparseCoder passes max_iter to LassoLarse Nov 29, 2018
@adrinjalali adrinjalali changed the title [MRG] ENH SparseCoder passes max_iter to LassoLarse ENH SparseCoder passes max_iter to LassoLarse Nov 29, 2018
@jnothman jnothman changed the title ENH SparseCoder passes max_iter to LassoLarse ENH SparseCoder passes max_iter to LassoLars Dec 3, 2018
:mod:`sklearn.decomposition`
............................

- |Fix| :class:`decomposition.SparseCoder` now passes `max_iter` to the
Member

This is now out of date. Please describe here what this change does, or at least update the PR description, as now it seems unclear, with the addition of method_max_iter and transform_max_iter

@adrinjalali
Member Author

This PR now touches quite a few public functions/classes:

  • sparse_encode() now passes the max_iter to the underlying LassoLars when algorithm='lasso_lars' (BUG)
  • dict_learning() and dict_learning_online() now accept method_max_iter and pass it to sparse_encode (ENH)
  • SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning now take a transform_max_iter parameter and pass it to either dict_learning() or sparse_encode() (ENH)

I've updated the whats_new entries reflecting these changes, but I have two questions:

  1. Those are three separate entries in the changelog; I'm not sure how to merge them, or whether I should. Should I also include the models in the Changed Models section?
  2. I've added a test which checks for the warning I was encountering in the example (ConvergenceWarning), but there's no new explicit test for the five other public functions/classes changed in this PR. I'm not sure how best to test them, though!
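The entry points listed above can be sketched as follows, using dict_learning with method='cd' and a direct sparse_encode call (a sketch against current scikit-learn; data and hyperparameters are arbitrary):

```python
import numpy as np
from sklearn.decomposition import dict_learning, sparse_encode

rng = np.random.RandomState(0)
X = rng.randn(20, 8)

# dict_learning forwards method_max_iter to its inner sparse_encode calls
code, dictionary, errors = dict_learning(X, n_components=5, alpha=1,
                                         max_iter=20, method='cd',
                                         method_max_iter=500, random_state=0)

# sparse_encode forwards max_iter to the underlying solver (the original bug)
code2 = sparse_encode(X, dictionary, algorithm='lasso_lars', alpha=0.1,
                      max_iter=500)
print(code.shape, dictionary.shape, code2.shape)
```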

@adrinjalali adrinjalali changed the title ENH SparseCoder passes max_iter to LassoLars ENH add a transform_max_iter to SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning Dec 3, 2018
@jnothman
Member

jnothman commented Dec 6, 2018 via email

@jnothman
Member

I've not yet understood this fully, but I am not very familiar with the algorithms. Why is method_max_iter different from transform_max_iter?

@adrinjalali
Member Author

I mostly followed whatever naming convention was there in each method/class. Some call the algorithm a method, and prefix the parameters related to it with method_, and some call it transform_algorithm, and use transform_ prefix for the algorithm's parameters.

Member

@jnothman jnothman left a comment

As regards the other tests, perhaps checking that changing parameter value can affect the result is an okay level of testing.
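A test in that spirit might look like the following sketch (using the final transform_max_iter API; the alpha value is an arbitrary small number chosen so that one coordinate-descent pass is far from convergence):

```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.RandomState(0)
D = rng.randn(10, 30)
D /= np.linalg.norm(D, axis=1, keepdims=True)
X = rng.randn(6, 30)

def encode(max_iter):
    coder = SparseCoder(D, transform_algorithm='lasso_cd',
                        transform_alpha=1e-3, transform_max_iter=max_iter)
    return coder.transform(X)

early = encode(1)     # stops after a single coordinate-descent pass
late = encode(1000)   # runs (close) to convergence
# changing the parameter should affect the result
print(np.allclose(early, late))
```

The early run typically emits a ConvergenceWarning, which is itself evidence that max_iter reached the solver.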

def ricker_function(resolution, center, width):
"""Discrete sub-sampled Ricker (Mexican hat) wavelet"""
x = np.linspace(0, resolution - 1, resolution)
x = ((2 / ((np.sqrt(3 * width) * np.pi ** 1 / 4)))
Member

I think you must need parentheses around the 1/4... ** takes precedence. Perhaps use 0.25 instead of 1/4. You can remove some parentheses by changing * to /.

Member Author

I didn't write these; they're copy-pasted from the failing example (plot_sparse_coding.py).

But I'll try to simplify both then.

x = np.linspace(0, resolution - 1, resolution)
x = ((2 / ((np.sqrt(3 * width) * np.pi ** 1 / 4)))
* (1 - ((x - center) ** 2 / width ** 2))
* np.exp((-(x - center) ** 2) / (2 * width ** 2)))
Member

You have extraneous brackets here making things hard to read. ** takes precedence over *
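With the precedence fixed (np.pi ** 0.25 instead of np.pi ** 1 / 4, which parses as π¹/4) and the redundant brackets dropped, the wavelet could read as follows (a sketch of the intended formula, not necessarily the exact final example code):

```python
import numpy as np

def ricker_function(resolution, center, width):
    """Discrete sub-sampled Ricker (Mexican hat) wavelet."""
    x = np.linspace(0, resolution - 1, resolution)
    return (2 / (np.sqrt(3 * width) * np.pi ** 0.25)
            * (1 - (x - center) ** 2 / width ** 2)
            * np.exp(-(x - center) ** 2 / (2 * width ** 2)))

wavelet = ricker_function(100, 50, 10)
print(wavelet.argmax())  # the wavelet peaks at the center sample
```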

@adrinjalali
Member Author

I'm not sure where the issue is coming from, but now examples/decomposition/plot_sparse_coding.py doesn't converge with the lasso_cd method, no matter what max_iter is. Any easy answers?

@adrinjalali
Member Author

Tests are green again :)

@adrinjalali
Member Author

@jnothman happy with this now maybe?

@adrinjalali
Member Author

@jnothman should we put this in for 0.21?

Member

@thomasjpfan thomasjpfan left a comment

The simplest test for MiniBatchDictionaryLearning and DictionaryLearning would be to mock out sparse_encode, but I think it is not needed.
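A mock-based test could look roughly like the sketch below. Note the private module path sklearn.decomposition._dict_learning and the assumption that the estimators call sparse_encode through that module-level name are implementation details that vary across scikit-learn versions, so the spy may see no calls on some releases:

```python
from unittest import mock

import numpy as np
import sklearn.decomposition._dict_learning as dl_mod  # private path: an assumption
from sklearn.decomposition import DictionaryLearning

rng = np.random.RandomState(0)
X = rng.randn(12, 6)

# Wrap the real sparse_encode so behavior is unchanged but calls are recorded.
with mock.patch.object(dl_mod, 'sparse_encode',
                       wraps=dl_mod.sparse_encode) as spy:
    est = DictionaryLearning(n_components=3, max_iter=2,
                             transform_algorithm='lasso_lars',
                             transform_max_iter=123, random_state=0)
    code = est.fit(X).transform(X)

# Which max_iter values reached sparse_encode (empty if internals bypass it)?
forwarded = [c.kwargs.get('max_iter') for c in spy.call_args_list]
print(code.shape, 123 in forwarded)
```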

@@ -56,6 +56,54 @@ def test_dict_learning_overcomplete():
assert dico.components_.shape == (n_components, n_features)


def test_max_iter():
Member

Testing for convergence this way is great!

@@ -832,7 +848,8 @@ def dict_learning_online(X, n_components=2, alpha=1, n_iter=100,
                 print('|', end=' ')
         code = sparse_encode(X, dictionary.T, algorithm=method, alpha=alpha,
                              n_jobs=n_jobs, check_input=False,
-                             positive=positive_code)
+                             positive=positive_code, max_iter=method_max_iter,
+                             verbose=verbose)
Member

We may have avoided this because sparse_encode accepts an int for verbose, while verbose in this context is a bool. (Although this will still work.)

Member Author

We could pass verbose > 0, but I guess this way it's fine.

@@ -569,7 +576,8 @@ def dict_learning(X, n_components, alpha, max_iter=100, tol=1e-8,

         # Update code
         code = sparse_encode(X, dictionary, algorithm=method, alpha=alpha,
-                             init=code, n_jobs=n_jobs, positive=positive_code)
+                             init=code, n_jobs=n_jobs, positive=positive_code,
+                             max_iter=method_max_iter, verbose=verbose)
Member

We may have avoided this because sparse_encode accepts an int for verbose, while verbose in this context is a bool. (Although this will still work.)

@adrinjalali
Member Author

@thomasjpfan I'm not sure if in your reviews you want me to change anything :P

@@ -493,6 +494,12 @@ def dict_learning(X, n_components, alpha, max_iter=100, tol=1e-8,

         .. versionadded:: 0.20

+    method_max_iter : int, optional (default=1000)
+        It is passed to the underlying ``method`` as their ``max_iter``
Member

"Maximum number of iterations to perform it..."

Member Author

You mean to add this at the beginning of the sentence? It would become Maximum number of iterations to perform it is passed to the underlying ``method`` as their ``max_iter`` parameter. I can't parse that sentence; I'm confused!

Member

Since this is passed to sparse_encode I was thinking of using the docstring there here:

method_max_iter : int, 1000 by default
    Maximum number of iterations to perform if `algorithm='lasso_cd'`.

This way a user would not need to figure it out by parsing the code to get to sparse_encode and see how max_iter is used.

Member Author

I think this looks better now.

@@ -713,8 +714,7 @@ def dict_learning_online(X, n_components=2, alpha=1, n_iter=100,
         .. versionadded:: 0.20

     method_max_iter : int, optional (default=1000)
-        It is passed to the underlying ``method`` as their ``max_iter``
-        parameter.
+        Maximum number of iterations to perform in each ``sparse_encode`` step.
Member

To be consistent with dict_learning:

Maximum number of iterations to perform.

Member Author

The reason for the extra part at the end is that dict_learning_online has an n_iter parameter as well, which has Maximum number of iterations to perform. as the description. I needed to distinguish the two somehow.

Member

With two iters, we may need to describe what they actually do. Since dict_learning_online only supports lassos, maybe:

n_iter:
    Number of mini-batch iterations to perform.

method_max_iter:
    Maximum number of iterations to perform when solving the lasso problem.

@adrinjalali
Member Author

Oops, I suppose I also need to change the versions in the docstrings and move the whats_new entry.

@adrinjalali
Member Author

ping @thomasjpfan :)

@thomasjpfan thomasjpfan merged commit 7b8cbc8 into scikit-learn:master Jul 8, 2019
@thomasjpfan
Member

Thank you @adrinjalali !

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
Successfully merging this pull request may close these issues.

SparseCoder doesn't expose max_iter for Lasso