Contributor

@snath-xoc snath-xoc commented Aug 29, 2025

Reference Issues/PRs

Related to issue #31885

What does this implement/fix? Explain your changes.

This PR deprecates the use of probability=True in both SVC and NuSVC.

Any other comments?

For now, the default is probability="deprecated", which is subsequently set to probability=False. This is because, in the base libsvm.fit function (pyx code), probability is required to be a boolean integer. Perhaps in the future we could hard-code this to 0 (i.e., False), since probability=True should no longer be used?

FYI @ogrisel

github-actions bot commented Aug 29, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: bb61c1a. Link to the linter CI: here

Member

@ogrisel ogrisel left a comment

Thanks @snath-xoc for the PR. Here is a pass of feedback. Once addressed, could you please fix any issues reported by the CI so that it is green before pinging reviewers again?

Feel free to ask for help if there are particular CI failures you don't know how to address.

Comment on lines 1105 to 1108
# XXX: this test is thread-unsafe because it uses probability=True:
# https://github.com/scikit-learn/scikit-learn/issues/31885
@pytest.mark.thread_unsafe
@pytest.mark.filterwarnings("ignore::FutureWarning")
Member

I think we could change this test to remove probability=True along with all the test function annotations: it seems that the convergence problem tested in this test is unrelated to the choice of probability=True.

Contributor Author

@snath-xoc snath-xoc Sep 11, 2025

Sounds good. Shall I do the same for other tests as well, e.g., test_svc_clone_with_callable_kernel at L1061?

Never mind, it is using predict_proba.

@@ -877,6 +882,16 @@ def __init__(
break_ties=False,
random_state=None,
):
if probability != "deprecated":
Member

This should be moved to the fit method. As per the scikit-learn estimator API, the constructor parameters should always be stored unchanged as attributes. Any validation logic should be deferred to the fit method.
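
To illustrate the convention described in the comment above, here is a minimal sketch. It uses a hypothetical `ToySVC` class and a hypothetical `rebuild` helper (a simplified stand-in for cloning), not scikit-learn's actual SVC; the warning text is illustrative only. The constructor only stores its arguments, so constructing or re-creating the estimator never warns or fails, while the deprecation warning and the validation happen in `fit`.

```python
import warnings


class ToySVC:
    """Hypothetical stand-in for an estimator; not scikit-learn's SVC."""

    def __init__(self, probability="deprecated", C=1.0):
        # Constructor: store every parameter exactly as given, no validation.
        self.probability = probability
        self.C = C

    def get_params(self):
        return {"probability": self.probability, "C": self.C}

    def fit(self, X, y):
        # The deprecation warning and validation are deferred to fit.
        if self.probability != "deprecated":
            warnings.warn(
                "probability is deprecated (illustrative message only).",
                FutureWarning,
            )
        if self.C <= 0:
            raise ValueError("C must be positive.")
        return self


def rebuild(estimator):
    # Simplified analogue of cloning: re-create from constructor parameters.
    return type(estimator)(**estimator.get_params())


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    est = ToySVC(probability=True, C=-1.0)  # invalid C, but no error yet
    copy = rebuild(est)

construction_warnings = len(caught)  # nothing is emitted before fit
```

Because the parameters are stored unchanged, `copy.get_params()` matches `est.get_params()`, which is what makes this kind of round trip reliable.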

@jeremiedbb jeremiedbb added the API label Sep 12, 2025
@@ -304,7 +304,7 @@ def fit(self, X, y):
{
"estimators": [
("lr", LogisticRegression()),
- ("svm", SVC(max_iter=50_000)),
+ ("svm", SVC(probability=True, max_iter=50_000)),
Member

Why this change? The purpose of the original test was to check that ValueError is raised with a base estimator that does not expose the predict_proba method.

So let's not add a deprecated option to a test that does not require it.

FutureWarning,
)
else:
self.probability = False
Member

As per the scikit-learn estimator API, fit should not change the value of a public attribute set in the constructor. Instead, define a local variable named probability and set it to self.probability when self.probability != "deprecated" and False otherwise.

Member

Then update the rest of the fit method to use the probability local variable instead of self.probability.
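
The local-variable pattern suggested in the two comments above could look roughly like the following sketch. `SketchClassifier` is a hypothetical class, and both the warning text and the `fitted_with_probability_` attribute are assumptions made for illustration; they do not reflect the actual SVC implementation.

```python
import warnings


class SketchClassifier:
    """Hypothetical estimator used only to illustrate the pattern."""

    def __init__(self, probability="deprecated"):
        self.probability = probability

    def fit(self, X, y):
        # Resolve the sentinel into a local variable; self.probability is
        # never reassigned, so the constructor parameter stays unchanged.
        if self.probability != "deprecated":
            warnings.warn(
                "probability is deprecated (illustrative message only).",
                FutureWarning,
            )
            probability = self.probability
        else:
            probability = False

        # The rest of fit uses the local variable. Here it is converted to
        # the 0/1 integer that a lower-level routine might expect.
        self.fitted_with_probability_ = int(probability)
        return self


est = SketchClassifier().fit([[0.0], [1.0]], [0, 1])
```

After fitting, `est.probability` is still the `"deprecated"` sentinel, while the resolved value is only visible through what `fit` computed from the local variable.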

@@ -143,15 +144,19 @@ def fit(self, X, y):
X_test_sparse = sparse_container(X_test)
# Trained on sparse format
sparse_classifier = BaggingClassifier(
- estimator=CustomSVC(kernel="linear", decision_function_shape="ovr"),
+ estimator=CustomSVC(
+ probability=True, kernel="linear", decision_function_shape="ovr"
Member

Why enable the deprecated attribute in a test that would not need predict_proba in the first place?

random_state=1,
**params,
).fit(X_train_sparse, y_train)
sparse_results = getattr(sparse_classifier, method)(X_test_sparse)

# Trained on dense format
dense_classifier = BaggingClassifier(
- estimator=CustomSVC(kernel="linear", decision_function_shape="ovr"),
+ estimator=CustomSVC(
+ probability=True, kernel="linear", decision_function_shape="ovr"
Member

Same comment here. This test should not use the deprecated attribute. If the tests need a base estimator with predict_proba, we should instead wrap it with CalibratedClassifierCV.
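
For reference, wrapping an SVC with CalibratedClassifierCV to obtain predict_proba could look like the sketch below. It uses a plain SVC on synthetic data rather than the test's CustomSVC, and the dataset size and `cv=3` are arbitrary choices made for illustration.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic binary classification problem (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# The SVC is left without probability=True; the calibration wrapper
# supplies predict_proba based on the SVC's decision_function.
calibrated_svc = CalibratedClassifierCV(SVC(kernel="linear"), cv=3)
calibrated_svc.fit(X, y)

proba = calibrated_svc.predict_proba(X[:5])
```

The wrapped estimator could then be passed wherever the test needs a base estimator that exposes predict_proba.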
