[MRG] ChainedImputer -> IterativeImputer, and documentation update #11350


Merged: 30 commits, Sep 16, 2018

Conversation

@sergeyf (Contributor) commented Jun 25, 2018

Addresses two points as discussed in #11259:
(a) Removes the "average last n_imputations" behavior. There is now just one parameter, n_iter, instead of the two: n_burn_in and n_imputations.
(b) New flag: sample_after_predict, which is False by default. If True, it samples from the predictive posterior after predicting during each round-robin iteration, making ChainedImputer run a single stochastic chain.

@jnothman @glemaitre: I think ChainedImputer needs a new check: if sample_after_predict=True, then the predictor needs to have return_std as a parameter of its predict method. What's best practice for checking something like that?

Also, we may want a new example of how to use ChainedImputer for the purpose of a MICE-type analysis, and how to use ChainedImputer with a RandomForest as the predictor instead of BayesianRidge to demonstrate missForest functionality.

I think @RianneSchouten would be a good candidate for the MICE example as a contribution to this PR or a new one, and I can stick ChainedImputer + RF into plot_missing_values.py as another bar on the plot. Let me know how that sounds.

@sergeyf sergeyf mentioned this pull request Jun 25, 2018
@jnothman (Member)

CIs failing, FYI

@sergeyf (Contributor, Author) commented Jun 25, 2018

Fixed, thank you!

@RianneSchouten commented Jun 25, 2018 via email

jnothman previously approved these changes Jun 25, 2018
Number of initial imputation rounds to perform the results of which
will not be returned.
n_iter : int, optional (default=10)
Number of imputation rounds to perform before returning the final
Member

This should probably clarify that a round indicates a single imputation of each feature

Member

Remove final here.


sample_after_predict : boolean, default=False
Whether to sample from the predictive posterior of the fitted
predictor for each Imputation. Set to ``True`` if using
Member

Assuming gaussian posterior. Predictor requires return_std support in predict.

It must support ``return_std`` in its ``predict`` method if
``sample_after_predict`` option is set to ``True`` below.

sample_after_predict : boolean, default=False
Member

I think this name is unnecessarily verbose.

Brainstorming.

  • fill_with='expectation' vs 'sample'
  • random_impute bool
  • use_std
  • sampled
  • random_draw
  • randomized (but there are other random components)

Contributor Author

How about predict_posterior bool?

@sergeyf (Contributor, Author) commented Jun 25, 2018

@stefvanbuuren This PR has a lot of the documentation changes you suggested. Please take a look at what's here and let me know if we're addressing your concerns. What's left to do is to make an example that actually demonstrates how to use ChainedImputer as MICE.

@sergeyf (Contributor, Author) commented Jun 25, 2018

@jnothman I'd like to raise an error if a predictor without return_std is passed in when predict_posterior=True. The only way I've found to do this is the inspect module as follows:

import inspect

predictor = BayesianRidge()
'return_std' in inspect.signature(predictor.predict).parameters  # True

predictor = RandomForestRegressor()
'return_std' in inspect.signature(predictor.predict).parameters  # False

OK to rely on the inspect module in such a fashion in the middle of impute.py?
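For illustration only, such a check could be sketched as below. WithStd, WithoutStd, and supports_return_std are hypothetical stand-ins, not sklearn estimators or API:

```python
import inspect

class WithStd:
    """Stand-in for a predictor whose predict accepts return_std."""
    def predict(self, X, return_std=False):
        preds = [0.0] * len(X)
        return (preds, [1.0] * len(X)) if return_std else preds

class WithoutStd:
    """Stand-in for a predictor without return_std support."""
    def predict(self, X):
        return [0.0] * len(X)

def supports_return_std(predictor):
    # Look for a return_std parameter in the bound predict method's signature.
    return 'return_std' in inspect.signature(predictor.predict).parameters

print(supports_return_std(WithStd()))     # True
print(supports_return_std(WithoutStd()))  # False
```

An imputer could call such a helper in fit and raise a TypeError early when sampling is requested but unsupported.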

@sergeyf sergeyf closed this Jun 25, 2018
@sergeyf sergeyf reopened this Jun 25, 2018
@jorisvandenbossche (Member) left a comment

@sergeyf Thanks a lot for keeping up making changes based on the discussion!
Added some quick comments on the docs.

Then, the regressor is used to predict the unknown values of `y`. This is repeated
for each feature in a chained fashion, and then is done for a number of imputation
rounds. Here is an example snippet::
A more sophisticated approach is to use the :class:`ChainedImputer` class, models
Member

"model" -> "which models" ?

Contributor Author

Whoops, thanks.

estimate for imputation. It does so in an iterated round-robin fashion: at each step,
a feature column is designated as output `y` and the other feature columns are treated
as inputs `X`. A regressor is fit on `(X, y)` for known `y`. Then, the regressor is
used to predict the unknown values of `y`. This is repeated for each feature in a
Member

"unknown" -> "missing"? (only a suggestion, the unknown is fine as well, but the missing might be more clear since it about imputing the missing values)

Contributor Author

Changed, thanks.

of the features are Gaussian.
Basic implementation of chained mutual regressions to find replacement
values in multivariate missing data. This version assumes all features
are Gaussian.
Member

Is it needed to mention this restriction of Gaussian? As I assume it depends on the predictor one uses.

Contributor Author

So far yes. The posterior sampling is done via a call to predict that also returns the standard deviation and then we sample from a Gaussian posterior. It's a limitation at the moment.

chained fashion, and then is done for a number of imputation rounds. The results
of the final imputation round are returned. Our implementation was inspired by the
R MICE package (Multivariate Imputation by Chained Equations), but differs from
it in setting single imputation to default instead of multiple imputation. This
Member

I find the "setting single imputation to default" still confusing.
As AFAIK, the ChainedImputer itself can only do single imputation, that's not a matter of its default settings. You can of course then use it multiple times (setting the appropriate arguments for this case) to do multiple imputation.

@stefvanbuuren commented Jun 25, 2018

Yes, you're right, but by now users have come to expect that "chained equations" refers to multiple imputation, so the combination "chained + single" would confuse quite a few.

@stefvanbuuren

@sergeyf I think it is clear now that your method does single imputation by default.

Perhaps for balance also mention that what you describe as "the most common use case" differs from what is generally recommended by the statistical community (you can refer to Little & Rubin (2002, Ch 4) and Rubin (1987, Ch 1)), and only then say it's an open problem.

@sergeyf (Contributor, Author) commented Jun 25, 2018

@stefvanbuuren this is the current full paragraph, which starts with the point about it being different in the statistics community:

In the statistics community, it is common practice to perform multiple imputations,
generating, for example, 10 separate imputations for a single feature matrix.
Each of these 10 imputations is then put through the subsequent analysis pipeline
(e.g. feature engineering, clustering, regression, classification). The 10 final
analysis results (e.g. held-out validation error) allow the data scientist to
obtain understanding of the uncertainty inherent in the missing values. The above
practice is called multiple imputation. As implemented, the :class:`ChainedImputer`
class generates a single imputation for each missing value because this
is a common use case for machine learning applications. However, it can also
be used for multiple imputations by applying it repeatedly to the same dataset with
different random seeds.

I'll add a parenthetical reference to it.
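The repeated-application idea in that paragraph can be sketched with the present-day sklearn API. This is an illustrative sketch beyond the PR text; it assumes the merged IterativeImputer with sample_posterior=True (which supplies the randomness needed for imputations to differ across seeds):

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
X[rng.rand(50, 3) < 0.2] = np.nan  # introduce roughly 20% missing values

# One single imputation per seed; together they form a multiple imputation.
imputations = [
    IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(X)
    for seed in range(5)
]

stacked = np.stack(imputations)     # shape (5, 50, 3)
per_cell_std = stacked.std(axis=0)  # spread reflects uncertainty in missing values
```

Each imputed matrix would then be fed through the downstream analysis pipeline, and the variation in the 5 final results summarizes the uncertainty due to missingness.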

@sergeyf sergeyf changed the title [WIP] ChainedImputer update [MRG+1] ChainedImputer update Jun 25, 2018
@jnothman (Member)

I don't really like predict_posterior as the difference between the settings is not about whether prediction is from the posterior, but about whether we use the expectation or a random draw from that posterior

I don't think it is essential to raise an explicit error for the case where the estimator lacks return_std: the default TypeError will be informative enough when it tries to predict, and users who want to use the sampling variant will likely have read its documentation, which mentions return_std.

@sergeyf (Contributor, Author) commented Jun 25, 2018

OK, thanks. I'll change the name to posterior_draw?

The only decision left is whether to change the name to IterativeImputer.

@jnothman (Member) commented Jun 25, 2018 via email

@sergeyf (Contributor, Author) commented Jun 26, 2018

@glemaitre @jnothman I changed the default to be RidgeCV in case sample_posterior=False. One thing that I saw is that some of the tests started failing. It looks like the default alphas in RidgeCV are only [0.1, 1, 10]. I'm worried that's too much regularization. So, I also added a small value in case no regularization is needed:

self._predictor = RidgeCV(alphas=np.array([1e-5, 0.1, 1, 10]))

Let me know if there are any other gotchas I should be aware of in instantiating a default RidgeCV.

Thanks!

@jnothman (Member) left a comment

Please check that sample_posterior=False is deterministic if n_nearest_features is None.

I've also realised that n_nearest_features documentation fails to mention that the features are not necessarily the nearest, but are drawn with probability proportional to correlation.

(And now that we're supporting non-linear regressors, I wonder if we should be using a rank correlation in n_nearest_features, or max(spearman_rho, pearson_rho) or something nasty like that. This n_nearest_features is useful but might get in the way of good results with non-linear prediction.)

Thanks for all this great work. I think when we're finished here, we'll have a very powerful and useful tool.

As implemented, the :class:`IterativeImputer` class generates a single imputation
for each missing value because this is the most common use case for machine learning
applications. However, it can also be used for multiple imputations by applying it
repeatedly to the same dataset with different random seeds.
Member

Unless you're using a posterior sample or perhaps a highly randomised regressor, I don't think there is quite enough randomness in IterativeImputer for it to be an appropriate method of multiple imputation.

Contributor Author

Right, I should change this to say set sample_posterior=True.

See Chapter 4 of "Statistical Analysis with Missing Data" by Little and Rubin for
more discussion on multiple vs. single imputations.

It is still an open problem as to how useful single vs. multiple imputation is in
Member

"when the user is not interested in measuring uncertainty due to missing values"

Contributor Author

Good addendum, thanks.

n_imputations : int, optional (default=100)
Number of chained imputation rounds to perform, the results of which
will be used in the final average.
n_iter : int, optional (default=10)
Member

There was a suggestion to call this max_iter and consider early stopping in a later pr

Contributor Author

I'd prefer to leave it as-is until we actually support early stopping. The word max would be confusing as it is now.

Member

If we agree that in the future we would like to do the early stopping, I would already rename it now. It's true that this will be a bit confusing, but having to rename it after it is released also doesn't seem a good option.

Member

This will be one point to address in future questions about handling non-MICE cases

@@ -498,18 +505,24 @@ class ChainedImputer(BaseEstimator, TransformerMixin):

Attributes
----------
initial_imputer_ : object of class :class:`sklearn.preprocessing.Imputer`'
The imputer used to initialize the missing values.
initial_imputer_ : object of type :class:`sklearn.impute.SimpleImputer`
Member

Good catch! I was thinking we should make it possible to set initial_strategy to an imputer, to allow something non-parametric, like the SamplingImputer which may or may not appear soon.

Contributor Author

Another thing I'd like to defer to future Sergey =)

self._predictor = BayesianRidge()
else:
from .linear_model import RidgeCV
# including a very small alpha to approximate OLS
Member

I do wonder if we should just be using OLS. But for near-parity with the sample_posterior case, this makes sense.

# then there is no need to do burn in and the result should be
# just the initial imputation (before clipping)
if self.n_imputations < 1:
# edge case: in case the user specifies 0 for n_burn_in,
Member

n_iter

return X_filled

X_filled = np.clip(X_filled, self._min_value, self._max_value)
# clip only the initial filledin values
Member

I don't really get why we would expect the initial imputation to require clipping.

Contributor Author

Huh, that's a good point. What I should probably do is this:

    if self.n_iter < 1:
        X_filled[mask_missing_values] = np.clip(X_filled[mask_missing_values],
                                                self._min_value,
                                                self._max_value)
        return X_filled

Contributor Author

Or actually just not clip before imputing.

if self.verbose > 1:
print('[ChainedImputer] Ending imputation round '
print('[IterativeImputer] Ending imputation round '
Member

This isn't covered in tests, FWIW. It should be covered to ensure there's no AttributeError or whatever

Contributor Author

Will do.

@@ -486,44 +486,25 @@ def test_imputation_copy():
# made, even if copy=False.


def test_chained_imputer_rank_one():
Member

This should still be valid if we run multiple imputations and take the mean, shouldn't it? Let's try leave the tests in.

Contributor Author

It still is: I changed the name to test_iterative_imputer_rank_one and moved it to be with the other functional tests.

@@ -614,17 +592,17 @@ def test_chained_imputer_missing_at_transform(strategy):
initial_imputer.transform(X_test)[:, 0])


def test_chained_imputer_transform_stochasticity():
Member

This should be run with and without sample_posterior=True

Contributor Author

Will do.

@jnothman jnothman changed the title [MRG+1] ChainedImputer update [MRG] ChainedImputer update Jun 26, 2018
@jnothman (Member) commented Sep 3, 2018

A couple of minor merge conflicts will need resolution before we continue here. The what's new entry for ChainedImputer is now in v0.21.rst.

@sergeyf (Contributor, Author) commented Sep 3, 2018

@jnothman This was a strange merge conflict. It looked like both of the conflicting code snippets were something to add, so I did that. Please take a look and see if you agree. What are next steps here?

@jnothman (Member) commented Sep 4, 2018

I think you did an unfinished rename of a variable:

flake8 failures

examples/plot_missing_values.py:81:5: F841 local variable 'chained_impute_scores' is assigned to but never used
    chained_impute_scores = cross_val_score(estimator, X_missing, y_missing,
    ^
examples/plot_missing_values.py:87:14: F821 undefined name 'iterative_impute_scores'
            (iterative_impute_scores.mean(), iterative_impute_scores.std()))
             ^
examples/plot_missing_values.py:87:46: F821 undefined name 'iterative_impute_scores'
            (iterative_impute_scores.mean(), iterative_impute_scores.std()))
                                             ^

Error in examples:


Unexpected failing examples:
/home/circleci/project/examples/plot_missing_values.py failed leaving traceback:
Traceback (most recent call last):
  File "/home/circleci/project/examples/plot_missing_values.py", line 90, in <module>
    results_diabetes = np.array(get_results(load_diabetes()))
  File "/home/circleci/project/examples/plot_missing_values.py", line 87, in get_results
    (iterative_impute_scores.mean(), iterative_impute_scores.std()))
NameError: name 'iterative_impute_scores' is not defined

@sergeyf (Contributor, Author) commented Sep 4, 2018

Thanks @jnothman. This is kind of weird: I remember all tests passing just fine before all the unmerging started happening.

@sergeyf (Contributor, Author) commented Sep 4, 2018

@glemaitre it still says "requested changes" from you. Do you remember what these were?

@glemaitre (Member)

I would need to review once more, since a lot of changes have been made. For sure my previous comments have been addressed in some way.

@sergeyf (Contributor, Author) commented Sep 4, 2018 via email

@sergeyf (Contributor, Author) commented Sep 12, 2018

@jnothman any eta for further reviews? I'd like to get this merged and copy the code into fancyimpute while this is going through the approval process here.

@jnothman (Member) left a comment

Minor comments. I'd like to merge this soon, but if you could please pull together a list of remaining questions/issues/goals, @sergeyf, that would be great

n_imputations : int, optional (default=100)
Number of chained imputation rounds to perform, the results of which
will be used in the final average.
n_iter : int, optional (default=10)
Member

This will be one point to address in future questions about handling non-MICE cases

If ``sample_posterior`` is True, the predictor must support
``return_std`` in its ``predict`` method. Also, if
``sample_posterior=True`` the default predictor will be
``BayesianRidge()`` and ``RidgeCV`` otherwise.
Member

Use () in both cases or neither

Member

I think use a :class: reference in both

Contributor Author

Done.

Whether to sample from the (Gaussian) predictive posterior of the
fitted predictor for each imputation. Predictor must support
``return_std`` in its ``predict`` method if set to ``True``. Set to
``True`` if using ``IterativeImputer`` for multiple imputations.
Member

Although a small random forest with changing seeds might also be a good way to do multiple imputation.

Contributor Author

Just to clarify: this statement is in response to my specifying "Gaussian"? If so, yes, but we need a return_std for the random forest and I'm not sure what the right way to do this is yet. I think a good way to sample the posterior of the random forest is to pick the prediction of ONE of the n_trees uniformly at random. What we actually care about is sampling from the posterior, not the return_std.

Member

I'm not sure how to pick one of the trees uniformly at random in a sensible way. But the API could be extended to support different ways of sampling the posterior via a non-binary switch here, i.e. sample_posterior='std' or something

Contributor Author

Ok I can investigate if we agree it actually makes sense. I am not even sure how to tell if it does.
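The tree-sampling idea discussed above could look something like the following sketch. sample_forest_prediction is a hypothetical helper, not sklearn API; it approximates a posterior draw by using, for each query row, the prediction of one tree chosen uniformly at random:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
X = rng.randn(200, 4)
y = X[:, 0] + 0.1 * rng.randn(200)

forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

def sample_forest_prediction(forest, X, rng):
    """For each row of X, return the prediction of one tree picked uniformly at random."""
    per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])  # (n_trees, n_rows)
    choice = rng.randint(per_tree.shape[0], size=X.shape[0])
    return per_tree[choice, np.arange(X.shape[0])]

draw = sample_forest_prediction(forest, X[:10], rng)  # one stochastic draw per row
```

Whether such draws behave like samples from a well-calibrated posterior is exactly the open question raised in this thread.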

additional fitting. We do this by storing each feature's predictor during
the round-robin ``fit`` phase, and predicting without refitting (in order)
during the ``transform`` phase.
This implementation was inspired by the R MICE package (Multivariate
Member

I think this can be dropped here, and stay in the user guide

Contributor Author

Done.

Imputation by Chained Equations), but differs from it by returning a single
imputation instead of multiple imputations. However, multiple imputation
can be achieved with multiple instances of the imputer with different
random seeds run in parallel.
Member

This would need the regressor to use different random seeds, which we don't easily support atm. Something to think about

Contributor Author

I don't think so. As long as we can sample from the posterior predictive distribution of the regressor within IterativeImputer (with different random seeds), it's OK not to have different random seeds inside the regressor. Many regressors are deterministic, right?

can be achieved with multiple instances of the imputer with different
random seeds run in parallel.

To support imputation in inductive mode we store each feature's predictor
Member

I feel like, again, this might be User Guide material. But ambivalent

Contributor Author

I'd like to keep as much verbosity as possible, so I'll take your ambivalence as acquiescence to keep it here =)

@sergeyf (Contributor, Author) commented Sep 14, 2018

@jnothman thanks for the review.

In terms of outstanding issues, I'm not that sure actually.

I think there was a concern that n_iter should be called max_iter if we ever use early stopping.

We also still need a solid multiple imputation example.

Gael asked for "examples that convey the compromises and help users (and library developers) to make the right choices." I think he meant "what are sane defaults?"

We probably need to do a survey of the packages in R, figure out which ones have been battle tested and are well-respected, and see if IterativeImputer can do what they do as it currently exists.

Am I missing something obvious?

@sergeyf (Contributor, Author) commented Sep 14, 2018

@jnothman lots of weird test errors: SyntaxError: invalid escape sequence \_ I am almost sure this isn't my fault!

@sergeyf sergeyf closed this Sep 14, 2018
@sergeyf sergeyf reopened this Sep 14, 2018
@jnothman (Member)

I've merged in master where that CI issue was fixed.

@jnothman (Member)

Yes, I think you're right to make further changes driven by examples: we want an example that illustrates something along the lines of missForest functionality, and another along the lines of MICE, and use those to identify appropriate default behaviour.

@@ -462,7 +462,8 @@ class IterativeImputer(BaseEstimator, TransformerMixin):
If ``sample_posterior`` is True, the predictor must support
``return_std`` in its ``predict`` method. Also, if
``sample_posterior=True`` the default predictor will be
``BayesianRidge()`` and ``RidgeCV`` otherwise.
``:class:sklearn.linear_model:BayesianRidge()`` and
Member

This is not the right syntax. Git grep :class: for examples

@sergeyf (Contributor, Author) commented Sep 15, 2018

@jnothman I'm concerned that a few examples won't tell us about sensible defaults. What we actually need is a large-enough set of experiments where we sweep meaningful parameters to get a sense of reasonable defaults. What's big enough? How has sklearn previously decided on defaults when it's unclear?

@jnothman (Member) commented Sep 16, 2018 via email

@sergeyf (Contributor, Author) commented Sep 16, 2018

Ok makes sense. All tests pass now. I'll wait for this to merge and then build a missForest example on top of what we currently have. Let me know if there's anything else for this PR.

@jnothman jnothman merged commit a4f2a89 into scikit-learn:iterativeimputer Sep 16, 2018
@jnothman (Member)

Thanks!
