Conversation

@haoranShu
Contributor

Reference Issues

Fixes #10336

What does this implement/fix? Explain your changes.

Added a fit_predict method to all Gaussian mixture models, and added tests for both Bayesian Gaussian Mixture and Gaussian Mixture.

Any other comments?

fit now calls fit_predict, which does the actual computation. This way log_resp is conveniently available for the prediction.
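The delegation described above can be sketched with a toy class. This is not the scikit-learn implementation: `ToyMixture1D`, its equal weights, unit variances and the `_log_resp` helper are all assumptions made for illustration. It only shows `fit` acting as a thin wrapper around `fit_predict`, with the labels taken from the log responsibilities.

```python
import math


class ToyMixture1D:
    """Hypothetical two-component 1-D mixture with unit variances.

    Not scikit-learn code: it only illustrates `fit` delegating to
    `fit_predict`.
    """

    def __init__(self, n_iter=20):
        self.n_iter = n_iter

    def _log_resp(self, X):
        # Unnormalised log responsibilities of each component for each point
        # (equal weights and unit variances are assumed for brevity).
        return [[-0.5 * (x - m) ** 2 for m in self.means_] for x in X]

    def fit_predict(self, X):
        # Does the actual work: alternate E- and M-steps, then label the data.
        self.means_ = [min(X), max(X)]
        for _ in range(self.n_iter):
            resp = []
            for row in self._log_resp(X):  # E-step
                top = max(row)
                w = [math.exp(v - top) for v in row]
                total = sum(w)
                resp.append([v / total for v in w])
            # M-step: responsibility-weighted means.
            self.means_ = [
                sum(r[k] * x for r, x in zip(resp, X)) / sum(r[k] for r in resp)
                for k in range(2)
            ]
        # Final E-step so the labels match what predict() returns afterwards.
        log_resp = self._log_resp(X)
        return [row.index(max(row)) for row in log_resp]

    def fit(self, X):
        # fit() is a thin wrapper that discards the labels.
        self.fit_predict(X)
        return self

    def predict(self, X):
        return [row.index(max(row)) for row in self._log_resp(X)]


X = [0.1, -0.2, 0.3, 5.0, 5.2, 4.9]
labels = ToyMixture1D().fit_predict(X)
```

With this layout, `fit_predict(X)` and `fit(X).predict(X)` go through the same code path, which is the equivalence the tests in this pull request are meant to check.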

times until the change of likelihood or lower bound is less than
`tol`, otherwise, a `ConvergenceWarning` is raised.
`tol`, otherwise, a `ConvergenceWarning` is raised. After fitting, it
predicts the most probable label for the input data points.
Member

This docstring does not seem correct.

Contributor Author

You are right. I'm changing it back to the original.

"""Estimate model parameters with the EM algorithm.
The method fit the model `n_init` times and set the parameters with
The method first fit the model `n_init` times and set the parameters with
Member

should probably be "fits", right?

assert_array_equal(Y_pred1, Y_pred2)


def test_bayesian_mixture_predict_predict_proba():
Member

If this was copied from the other test, can you maybe say so?

Member

I don't get how this relates to the issue... What am I missing?

Contributor Author

Sorry, I just saw this! This was copied from the other test; I will add a comment saying so. We added this test because, to test fit_predict, we intended to check two things: 1. it is equivalent to fit().predict(); 2. its output is correct. There was no test for the correctness of predict() for bgmm, so we added one to make sure fit_predict actually yields the correct output.

Member

I see, great!

@amueller
Member

looks good. Is that jet in your avatar? ;)

@amueller amueller changed the title #10336 adding fit_predict to mixture models [MRG + 1] #10336 adding fit_predict to mixture models Jun 15, 2018
Member

@jnothman jnothman left a comment

Otherwise LGTM.

It looks like those tests could be refactored though

X = rand_data.X[covar_type]
Y = rand_data.Y
g = GaussianMixture(n_components=rand_data.n_components,
random_state=rng, weights_init=rand_data.weights,
Member

I think we should be passing random_state=0 rather than passing an object which will be changed with each iteration

Contributor Author

I don't think it matters, because all we need is for the GMM not to change within one iteration. Using a different random_state for each COVARIANCE_TYPE arguably tests robustness a bit more, I guess?
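The difference between the two options discussed here can be illustrated with the standard library. This is only an analogy: `random.Random` stands in for the NumPy `RandomState` object used in the tests, and the loop over covariance types mimics the test loop rather than reproducing it.

```python
import random

# Analogy only: random.Random stands in for numpy.random.RandomState.
shared_rng = random.Random(0)

draws_shared = []
draws_seeded = []
for covar_type in ["full", "tied", "diag", "spherical"]:
    # A shared generator keeps advancing, so each iteration sees new numbers.
    draws_shared.append(shared_rng.random())
    # A fresh generator built from a fixed integer seed restarts every time.
    draws_seeded.append(random.Random(0).random())
```

With a fixed integer seed every iteration starts from the same state, whereas the shared generator produces a different stream per iteration. Either choice keeps a single iteration self-consistent, which is the point made in the reply above.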

@jnothman
Member

You have flake8 failures

@haoranShu
Contributor Author

Just fixed some flake8 problems. There is one left at line 456 of file tests/test_bayesian_mixture.py which I do not know how to solve.

Y = rand_data.Y
bgmm = BayesianGaussianMixture(n_components=rand_data.n_components,
random_state=rng,
weight_concentration_prior_type=prior_type,
Member

you could either leave this flake8 issue unsolved, or do:

            bgmm = BayesianGaussianMixture(
                n_components=rand_data.n_components,
                random_state=rng,
                weight_concentration_prior_type=prior_type,
                covariance_type=covar_type)

@jnothman
Member

Please add an entry to the change log at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

I'm not sure if it's better listed under API changes or as an enhancement.

times until the change of likelihood or lower bound is less than
`tol`, otherwise, a `ConvergenceWarning` is raised. After fitting, it
predicts the most probable label for the input data points.
Member

Perhaps this deserves .. versionadded:: 0.20
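For reference, a minimal sketch of where such a directive could sit in a docstring. The stub below is hypothetical and is not the merged code; only the `.. versionadded:: 0.20` line comes from the suggestion above.

```python
def fit_predict(X, y=None):
    """Estimate model parameters using X and predict the labels for X.

    .. versionadded:: 0.20
    """
    # Illustration of the docstring layout only; no model logic here.
    raise NotImplementedError("docstring illustration only")
```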

Contributor Author

Sorry I missed an email and just saw this! Will do soon.

Contributor Author

I am putting it under API changes, because there seems to be no enhancement in terms of efficiency or accuracy.

舒浩然 and others added 2 commits July 2, 2018 20:37
@jnothman jnothman merged commit c303ed8 into scikit-learn:master Jul 3, 2018
@jnothman
Member

jnothman commented Jul 3, 2018

Thanks @haoranShu

Development

Successfully merging this pull request may close these issues.

Should mixture models have a clusterer-compatible interface

4 participants