[MRG] Deprecate residual_metric and add support for loss in RANSAC #5497

MechCoder · 2015-10-20T20:08:08Z

Partly fixes #4740

Supplying arbitrary residual metrics for 1-D targets was not possible.

from sklearn.linear_model import RANSACRegressor
from sklearn.datasets import make_regression
X, y = make_regression()
res_met = lambda dy: dy ** 2
ransac = RANSACRegressor(min_samples=5, residual_metric=res_met)
ransac.fit(X, y)
IndexError: too many indices for array

The workaround was to explicitly define res_met as accepting 2-D arrays (since a reshape is done internally), which is non-obvious.

import numpy as np
res_met = lambda dy: np.sum(dy**2, axis=1)
ransac = RANSACRegressor(min_samples=5, residual_metric=res_met)
ransac.fit(X, y)
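
A minimal NumPy-only sketch of why the first metric fails and the second works. It assumes, as the note about the reshape suggests, that 1-D residuals are turned into a column before the metric is called, and that the thresholded result is then used to index a 1-D array of sample indices. The names `elementwise`, `row_reduced`, `sample_idxs`, and the threshold value are illustrative and are not taken from ransac.py.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 0.0])

# Assumed internal step: 1-D residuals become an (n_samples, 1) column.
diff = (y_pred - y_true).reshape(-1, 1)


def elementwise(dy):
    # Keeps the (n_samples, 1) shape of its input.
    return dy ** 2


def row_reduced(dy):
    # Collapses each row to one value, giving shape (n_samples,).
    return np.sum(dy ** 2, axis=1)


sample_idxs = np.arange(len(y_true))
threshold = 1.0  # illustrative value, not a library default

# A 1-D boolean mask indexes the 1-D index array without trouble.
inliers = sample_idxs[row_reduced(diff) < threshold]

# A 2-D boolean mask cannot index a 1-D array, so NumPy raises IndexError.
try:
    sample_idxs[elementwise(diff) < threshold]
    raised = False
except IndexError:
    raised = True
```

Under these assumptions the element-wise metric is mathematically fine; the failure comes purely from the shape of what it returns.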

MechCoder · 2015-10-20T20:10:10Z

ping @amueller

In hindsight, I think something like loss_function, which takes y_true and y_pred as input and returns the loss, would have been better, since that way I could use the functions in sklearn.metrics directly.

MechCoder · 2015-10-20T20:28:39Z

Tests should fail because of this (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/tests/test_ransac.py#L296). I find it odd that the residual metric for a 1-D target expects a 2-D array.

I'm not sure what the cleanest way to proceed is, except for deprecating the behavior.

MechCoder · 2015-10-20T20:32:40Z

ping @ahojnnes. Would be great to have your input.

agramfort · 2015-10-21T12:17:26Z

test?

MechCoder · 2015-10-21T14:23:23Z

I can add a test if we decide what to do with the current behavior.

Currently, residual_metric must accept a 2-D array and return a 1-D array, even for single-output y.

agramfort · 2015-10-21T14:43:44Z

no opinion ...

MechCoder · 2015-10-21T14:45:09Z

deprecate?

agramfort · 2015-10-21T14:50:04Z

who wrote this in the first place?

MechCoder · 2015-10-21T14:51:49Z

See: #2025

agramfort · 2015-10-21T14:54:39Z

@ahojnnes what's your take on this?

MechCoder · 2015-10-22T01:24:12Z

also pinging @jnothman and @arjoly as they reviewed the earlier PR

amueller · 2015-10-23T14:56:50Z

@MechCoder travis is unhappy

arjoly · 2015-10-23T15:07:58Z

sklearn/linear_model/ransac.py

@@ -177,6 +183,15 @@ def __init__(self, base_estimator=None, min_samples=None,
        self.residual_metric = residual_metric
        self.random_state = random_state

+    def _residual_metric(residual):


You don't need this.

I mean you don't need to make a method out of this.

arjoly · 2015-10-23T15:08:25Z

Can you add tests?

MechCoder · 2015-10-23T15:24:32Z

@amueller Tests are failing because I do not know what to do with the existing behavior for 1-D array.

The parameter residual_metric accepts a callable that takes a 2-D array and returns a 1-D array.
This holds for both 1-D and 2-D targets.

For example, to support an arbitrary residual metric for a 1-D target, I currently need to do this:

res_met = lambda dy: np.sum(dy**2, axis=1)
ransac = RANSACRegressor(min_samples=5, residual_metric=res_met)
ransac.fit(X, y)

which is unusual, don't you think?

amueller · 2015-10-29T22:15:51Z

@agramfort @arjoly IRL @MechCoder and I just discussed the current behavior. We thought it might be more scikit-learn style to pass a scorer instead of the residual_metric. That would allow reuse of the functions in the metrics module.
We could deprecate residual_metric and introduce scoring. That makes writing your own one slightly harder, though.
Alternatively, we could pass a score_func which is just one of the metrics functions. That would mean we can't use strings, though. Wdyt?

agramfort · 2015-10-30T10:13:23Z

I don't use this code. Can you type usage snippets of before and after refactoring?

MechCoder · 2015-10-30T18:37:29Z

@agramfort

# Previous
import numpy as np
from sklearn.linear_model import RANSACRegressor
res_metric = lambda dy: np.sum(np.abs(dy), axis=1)  # 2-D in, 1-D out
ransac = RANSACRegressor(residual_metric=res_metric)
ransac.fit(X, y)

# Suggested approach 1
ransac = RANSACRegressor(scoring=mean_absolute_error)
ransac.fit(X, y)

# Suggested approach 2
scorer = make_scorer(mean_absolute_error, greater_is_better=False)
ransac = RANSACRegressor(scoring=scorer)
ransac.fit(X, y)

agramfort · 2015-10-31T09:25:03Z

thinking about it ... in terms of semantics, "score" is commonly used for prediction evaluation, while here it's used for the fit. It's more a loss, as specified in SGD. So in the end I would not call it a score.

MechCoder · 2015-10-31T13:51:30Z

Then, how do you suggest we provide string inputs?

agramfort · 2015-10-31T16:35:44Z

you do as in SGD. Support what makes sense for 90% of people, i.e. implement just MAE and MSE passed as full strings, but also accept callables. You keep all this in the ransac file. my 2c
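
A rough sketch of what this suggestion could look like: a small table of per-sample losses keyed by string, with callables passed through unchanged. `resolve_loss`, `_LOSSES`, and the two helper functions are hypothetical names used only for illustration and are not the code in this PR; the string names and the error message follow the ones quoted in the review of ransac.py in this thread. Summing over outputs for 2-D targets is an assumption.

```python
import numpy as np


def _absolute_loss(y_true, y_pred):
    # Per-sample absolute error; multi-output targets are summed per row.
    diff = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return diff if diff.ndim == 1 else diff.sum(axis=1)


def _squared_loss(y_true, y_pred):
    # Per-sample squared error; multi-output targets are summed per row.
    diff = (np.asarray(y_true) - np.asarray(y_pred)) ** 2
    return diff if diff.ndim == 1 else diff.sum(axis=1)


_LOSSES = {"absolute_loss": _absolute_loss, "squared_loss": _squared_loss}


def resolve_loss(loss):
    """Return a callable loss(y_true, y_pred) for a string name or a callable."""
    if callable(loss):
        return loss
    try:
        return _LOSSES[loss]
    except (KeyError, TypeError):
        raise ValueError(
            "loss should be 'absolute_loss', 'squared_loss' or a callable. "
            "Got %r" % (loss,)
        ) from None
```

Keeping the table private to one module mirrors the "keep all this in the ransac file" advice, at the cost of not reusing anything from the metrics module.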

MechCoder · 2015-11-01T14:40:26Z

sounds good to me as well.

amueller · 2015-11-02T16:44:40Z

I'm happy to use callables that take y_true and y_pred. It's just that I don't think we use this kind of function as an option anywhere else. But I don't have a strong opinion.
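
For context, a callable with this signature would plug into the inlier test roughly as follows. This is a sketch under the assumption that the loss returns one value per sample and that samples whose loss falls below a threshold count as inliers; `select_inliers`, `absolute_error`, and the threshold value are illustrative, not the merged implementation.

```python
import numpy as np


def select_inliers(loss, y_true, y_pred, residual_threshold):
    """Return a boolean mask of samples whose loss is below the threshold."""
    per_sample = np.asarray(loss(y_true, y_pred))
    if per_sample.ndim != 1:
        raise ValueError("loss must return a 1-D array with one value per sample")
    return per_sample < residual_threshold


def absolute_error(y_true, y_pred):
    # Example user-supplied loss: one value per sample.
    return np.abs(y_true - y_pred)


y_true = np.array([0.0, 1.0, 2.0, 10.0])
y_pred = np.array([0.1, 0.9, 2.2, 3.0])
mask = select_inliers(absolute_error, y_true, y_pred, residual_threshold=0.5)
```

A function that aggregates over all samples into a single number would fail the shape check here, which is one practical difference between a per-sample loss and a scorer.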

MechCoder · 2015-11-03T22:41:55Z

@amueller @agramfort I've made changes. Please review.

agramfort · 2015-11-04T09:12:31Z

that's fine with me. I'll let @amueller validate.

MechCoder · 2015-11-13T15:50:00Z

@amueller I can haz reviews?

ahojnnes · 2016-01-14T20:21:33Z

sklearn/linear_model/ransac.py

        else:
-            residual_metric = self.residual_metric
+            raise ValueError(
+                "loss should be 'absolute_loss', 'squared_loss' or a callable."


Missing space in the string at the end. Capitalize "Got".

ahojnnes · 2016-01-14T20:22:46Z

Left you two comments, otherwise LGTM.

MechCoder · 2016-01-14T22:44:09Z

Two +1's from @agramfort and @ahojnnes. Will merge when Travis passes.

MechCoder mentioned this pull request Oct 23, 2015

[MRG+1] Raise appropriate error if y is sparse #5542

Merged

arjoly reviewed Oct 23, 2015

MechCoder force-pushed the ransac_residual branch from cf3d731 to a01d360 Compare November 3, 2015 22:38

MechCoder changed the title ~~[MRG] Supply arbitrary residual_metrics to RANSAC for 1-D targets~~ [MRG] Deprecate residual_metric and add support for loss in RANSAC Nov 3, 2015

MechCoder mentioned this pull request Nov 20, 2015

[MRG+1] ENH: Feature selection based on mutual information #5372

Closed

amueller added the Waiting for Reviewer label Dec 10, 2015

MechCoder mentioned this pull request Jan 14, 2016

[MRG+1] Added sample_weight parameter to ransac.fit #6140

Closed

ahojnnes reviewed Jan 14, 2016

MechCoder force-pushed the ransac_residual branch from a01d360 to be70e65 Compare January 14, 2016 22:42

Deprecate residual_metric and add support for loss

761b1f7

MechCoder force-pushed the ransac_residual branch from be70e65 to 761b1f7 Compare January 14, 2016 22:42

MechCoder added a commit that referenced this pull request Jan 15, 2016

Merge pull request #5497 from MechCoder/ransac_residual

ac2ff4a

[MRG] Deprecate residual_metric and add support for loss in RANSAC

MechCoder merged commit ac2ff4a into scikit-learn:master Jan 15, 2016

MechCoder deleted the ransac_residual branch January 15, 2016 03:36
