Conversation

@vasselai (Contributor)

Further corrections are necessary to make 'monoensemble' work with current sklearn. The main ones that need your attention:

(1) The "presort" and "X_idx_sorted" sklearn parameters have been deprecated. See, respectively:
scikit-learn/scikit-learn#14907
scikit-learn/scikit-learn#16818
Since I don't know exactly how you would prefer to handle this in light of the suggestions in the first link above, the only thing I did to at least leave 'monoensemble' in a working state was to comment out "presort=self.presort" on line 1540 of 'mono_gradient_boosting.py'. A more definitive solution will be necessary, though: a FutureWarning is currently issued every iteration due to the "X_idx_sorted" deprecation, which, besides being annoying, means the code will soon break again if "X_idx_sorted" is not removed from the code base.
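One version-agnostic way to cope with removed parameters, sketched below under the assumption that you want a single code path for old and new scikit-learn (the names `filter_supported_kwargs` and `make_tree` are hypothetical, not from monoensemble or sklearn), is to pass a legacy kwarg only when the target callable still accepts it:

```python
import inspect

def filter_supported_kwargs(func, kwargs):
    # Keep only the keyword arguments that `func` actually accepts, so
    # deprecated parameters such as presort / X_idx_sorted get dropped
    # automatically on scikit-learn versions that removed them.
    accepted = inspect.signature(func).parameters
    return {k: v for k, v in kwargs.items() if k in accepted}

# Toy stand-in for an estimator constructor that no longer takes `presort`:
def make_tree(criterion="mse", max_depth=None):
    return {"criterion": criterion, "max_depth": max_depth}

tree = make_tree(**filter_supported_kwargs(
    make_tree, {"criterion": "mse", "presort": True}))
```

The same pattern would work for the `X_idx_sorted` argument of `fit`, at the cost of a signature inspection per call.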

(2) On line 436 of 'mono_forest.py', the call to "_generate_unsampled_indices" throws an error because that function now takes an extra parameter, 'n_samples_bootstrap':
https://github.com/scikit-learn/scikit-learn/blob/4b8cd880397f279200b8faf9c75df13801cb45b7/sklearn/ensemble/_forest.py#L123
I obviously also do not know your preference here, but given the implementation at that link, it seems safe to assume that your code has thus far been operating with the equivalent of 'n_samples_bootstrap = 1'. So that is what I imposed for now on line 436 of 'mono_forest.py'.
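The effect of that parameter can be illustrated with a small pure-Python sketch of the idea behind `_generate_unsampled_indices` (this is an illustration of the concept, not sklearn's actual implementation): redraw the bootstrap sample a tree was trained on, then report the indices that were never drawn as out-of-bag.

```python
import random

def generate_unsampled_indices(seed, n_samples, n_samples_bootstrap):
    # Redraw the bootstrap sample (n_samples_bootstrap draws with
    # replacement from range(n_samples)), then return the indices that
    # were never drawn -- the out-of-bag (OOB) set for that tree.
    rng = random.Random(seed)
    sampled = {rng.randrange(n_samples) for _ in range(n_samples_bootstrap)}
    return [i for i in range(n_samples) if i not in sampled]

# With n_samples_bootstrap=1 only a single index is ever drawn, so all
# but one of the samples are reported as out-of-bag:
oob = generate_unsampled_indices(seed=0, n_samples=10, n_samples_bootstrap=1)
```

In this sketch, a single draw always leaves `n_samples - 1` indices "unsampled", which shows why `n_samples_bootstrap = 1` can only be a stopgap for keeping the code running.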
@chriswbartley chriswbartley merged commit 1ae2b0b into chriswbartley:master Aug 11, 2021
@chriswbartley (Owner)

Thanks Fabricio - I have just done a push to address all these deprecation issues and warnings:

  • I removed presort and X_idx_sorted.
  • I fixed _generate_unsampled_indices. Note that 'n_samples_bootstrap = 1' gives erroneous oob_scores, so I fixed that too (by looking at what n_samples_bootstrap is meant to be in the sklearn code).
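For reference, the default derivation of that value can be sketched roughly as follows (a simplified, hypothetical rendering of what sklearn's forest code does with `max_samples`, not the actual source): by default the bootstrap draws as many samples as the training set contains, not 1.

```python
def get_n_samples_bootstrap(n_samples, max_samples=None):
    # Simplified sketch: with the default max_samples=None, each tree's
    # bootstrap sample is as large as the training set itself.
    if max_samples is None:
        return n_samples
    if isinstance(max_samples, float):
        # A float is interpreted as a fraction of the training set.
        return max(round(n_samples * max_samples), 1)
    return max_samples  # an explicit integer count
```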
