
Conversation

@jayybhatt (Contributor)

Reference Issues/PRs

Fixes #7141

What does this implement/fix? Explain your changes.

Added tests to make sure that the forest predicts all samples as inliers after being fitted on uniform data.
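
For reference, a minimal sketch of the kind of test this adds (the function name and exact shape of the data here are illustrative, not the committed test):

    import numpy as np
    from sklearn.ensemble import IsolationForest

    def test_iforest_predicts_inliers_on_uniform_data():
        # All samples are identical, so none of them can be isolated:
        # the fitted forest should label every sample as an inlier (+1).
        X = np.ones((100, 10))
        iforest = IsolationForest().fit(X)
        assert np.all(iforest.predict(X) == 1)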

Any other comments?

@jayybhatt changed the title from "Added validation test for iforest on uniform data" to "[MRG] Added validation test for iforest on uniform data" on Aug 24, 2019
@amueller (Member)

ping @agramfort maybe?
Looks good to me.

@jayybhatt (Contributor, Author)

@agramfort

@NicolasHug (Member) left a comment

minor comments but LGTM

@agramfort (Member) left a comment

looks reasonable. thx

@NicolasHug (Member)

Thanks @Jay-z007 !

@NicolasHug NicolasHug merged commit bcaf381 into scikit-learn:master Aug 26, 2019
@matwey commented Apr 16, 2020

Hi,

I am sorry to say so, but this commit doesn't seem to fix or test anything valuable.

Imagine the following code:

import numpy as np
from sklearn.ensemble import IsolationForest

for n_samples in range(100, 104):
    X = np.ones((n_samples, 10))
    iforest = IsolationForest()
    iforest.fit(X)
    # True if at least one sample is predicted as an inlier (+1)
    print(n_samples, np.any(iforest.predict(X) == 1))

With sklearn from master I have:

100 True
101 False
102 True
103 False

The iforest.predict(X) result depends only on floating-point rounding errors in this case.
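
One way to see this (a quick check, not part of this PR) is to inspect the decision function directly: for a forest fitted on identical samples, every value should sit within a few machine epsilons of the zero threshold, so predict() effectively reads the sign of rounding noise:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    X = np.ones((101, 10))
    iforest = IsolationForest().fit(X)
    # In exact arithmetic every value here would be exactly 0.0, i.e. right
    # on the default threshold; in practice the values are tiny positive or
    # negative numbers whose sign decides between inlier (+1) and outlier (-1).
    print(iforest.decision_function(X)[:5])
    print(iforest.predict(X)[:5])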

matwey added a commit to matwey/scikit-learn that referenced this pull request Apr 18, 2020
…rn#14771)"

This reverts commit bcaf381.

The test in the reverted commit is useless and does not exercise the code implementation. The commit claims to fix scikit-learn#7141, where an isolation forest trained on identical values produces degenerate trees.

Under the described circumstances, one may check that the exact score value for every point in the parameter space is zero (or 0.5, depending on whether we follow the original paper or the scikit-learn implementation). However, there is no special handling in the existing implementation, and the score value is subject to rounding errors. So, for instance, with 100 identical input samples we get a forest predicting everything as inliers, but with 101 input samples we get a forest predicting everything as outliers. The decision is determined purely by the floating-point rounding error.

One may check this by changing the number of input samples:

    X = np.ones((100, 10))

to

    X = np.ones((101, 10))

or something else.
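
To make the 0.5 figure concrete, here is a sketch of the path-length arithmetic from the original paper; average_path_length is a stand-in for the c(n) normalisation term, not the private scikit-learn helper:

    import numpy as np

    def average_path_length(n):
        # c(n) from Liu et al.: average path length of an unsuccessful
        # search in a binary search tree with n nodes.
        if n <= 1:
            return 0.0
        harmonic = np.log(n - 1.0) + np.euler_gamma  # approximates H(n - 1)
        return 2.0 * harmonic - 2.0 * (n - 1.0) / n

    n = 100
    # On n identical samples a tree cannot split, so every sample ends up in
    # the root leaf at depth 0, plus the c(n) correction for a leaf of size n.
    depth = 0.0 + average_path_length(n)
    score = 2.0 ** (-depth / average_path_length(n))
    print(score)  # 0.5: exactly on the inlier/outlier boundary
    # In the real estimator, subsampling and averaging over many trees turn
    # this exact 0.5 into a value perturbed by rounding, hence the flakiness.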


Merging this pull request closed: IsolationForest degenerates with uniform training data (#7141)