FIX - Improve the error message when the TextEncoder is fitted without installing the additional dependencies #1769
Hi @fxzhou22, thanks for working on this PR. It's already in good shape, but I made a suggestion on how to keep all the tests in the main The format checks are failing because of issues on our side, I'll deal with that.
Hi @rcap107 , thank you for the feedback! It should be my fault, but I don't seem to find your suggestion, would you mind pointing me to where the comment is please? However, I looked again into the
In this way, As a result, I remove also Thank you also for the format checks part. |
replace importorskip inside encoder() to avoid the problem and changes test_missing_import_error with direct creation
That's strange 🤔 I tried to ping you in the comment, can you see it?
I think it's better if we use the monkeypatch, rather than modifying the tests that are already there |
Sorry, I really don't find it... I'm quite new here and this is in fact my first contribution, I apologize for it. I'll look into monkeypatch and undo my changes to already existing tests. |
This test can be run from the same file using a small monkeypatch like so:
def test_missing_import_error(encoder, monkeypatch):
    """Test that a clear error is raised when sentence_transformers is missing.

    We mock the missing dependency by hiding it from sys.modules, then verify
    that TextEncoder.fit() raises an ImportError with a helpful message.
    """
    monkeypatch.setitem(sys.modules, "sentence_transformers", None)
    st = clone(encoder)
    x = pd.Series(["oh no"])
    err_msg = (
        "Missing optional dependency 'sentence_transformers'.*"
        "TextEncoder requires sentence-transformers.*"
        "install\\.html#deep-learning-dependencies"
    )
    with pytest.raises(ImportError, match=err_msg):
        st.fit(x)

By monkeypatching sys.modules we can make it look as if sentence_transformers is missing, and avoid issues with the importorskip at the top.
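For readers wondering why a None entry in sys.modules is enough, here is a minimal standalone sketch of the mechanism. It is not part of the skrub test suite: the helper name import_fails_when_hidden is hypothetical, and the stdlib json module stands in for sentence_transformers so the snippet runs anywhere. The manual save and restore mimics what pytest's monkeypatch does automatically at teardown.

```python
import importlib
import sys

_ABSENT = object()  # sentinel: the module had no sys.modules entry before


def import_fails_when_hidden(module_name):
    """Return True if importing `module_name` raises ImportError while
    its sys.modules entry is set to None (hypothetical helper)."""
    previous = sys.modules.get(module_name, _ABSENT)
    # Same effect as monkeypatch.setitem(sys.modules, module_name, None):
    # the import system treats a None entry as "import halted".
    sys.modules[module_name] = None
    try:
        importlib.import_module(module_name)
    except ImportError:
        return True
    else:
        return False
    finally:
        # Put things back the way they were, as monkeypatch would.
        if previous is _ABSENT:
            del sys.modules[module_name]
        else:
            sys.modules[module_name] = previous


print(import_fails_when_hidden("json"))  # prints True
```

Because the entry is restored afterwards, later imports of the module behave normally, which is why the trick is safe to use inside a single test.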
@fxzhou22 can you see this comment?
Yes, now I get it, thanks a lot!
This file can be removed with the inclusion of the monkeypatch in the main test file
It was my fault, I did not submit the review, my bad 😅 You should be able to see it now |
Yes, thanks a lot and no worries! I've tried with your suggested code, it seems to me that
In my opinion, the case in the bottom left corner is the situation/test case that we expect, when the user doesn't have I'll thus commit my changes with |
On the CI, the version of the test that uses monkeypatch runs in the configuration that includes sentence_transformers, and removes it to emulate the situation in which the dependency is missing. So in practice it would work just fine to test the import error.
I do like your solution better, since it's still testing the import error in the other configurations. |
Looks good to me, thanks a lot @fxzhou22 |
Thanks for the explanation and glad that I've helped. It was a good experience and I've learned a lot. Thank you also for the clear documentation that guided me well with my first contribution! |
Hello, this PR would:
- Improve the error message when TextEncoder is used without installing the optional sentence_transformers dependency.

Changes
- skrub/_text_encoder.py: Added explicit error message and reference to installation guide
- test_text_encoder_missing.py: New test file created with test_missing_import_error
  Note: It seems to me that in the original case, test_text_encoder.py has pytest.importorskip("sentence_transformers"), which skips the entire file when the package is not installed. That is why I created a new test file for it.
- test_text_encoder.py: Added documentation and error message updated for test_missing_import_error
- CHANGES.rst: Added changelog entry
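As an illustration of the kind of change described for skrub/_text_encoder.py, the sketch below wraps an optional import and re-raises with a more helpful message. This is an assumption about the general pattern, not the code in the PR: import_optional and demo are hypothetical names, and INSTALL_URL is a placeholder, since the thread only shows the install.html#deep-learning-dependencies suffix of the real link.

```python
import importlib
import re
import sys

# Placeholder address (assumption): only the trailing part appears in the thread.
INSTALL_URL = "https://example.invalid/install.html#deep-learning-dependencies"


def import_optional(module_name, caller, package_name):
    """Import an optional dependency, or raise an ImportError that says
    what is missing, who needs it, and where to find install instructions."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Missing optional dependency '{module_name}'. "
            f"{caller} requires {package_name}. "
            f"See {INSTALL_URL} for installation instructions."
        ) from exc


def demo():
    """Hide sentence_transformers, trigger the error, return its message."""
    absent = object()
    previous = sys.modules.get("sentence_transformers", absent)
    sys.modules["sentence_transformers"] = None  # emulate a missing package
    try:
        import_optional(
            "sentence_transformers", "TextEncoder", "sentence-transformers"
        )
    except ImportError as exc:
        return str(exc)
    finally:
        # Restore the previous state so the demo has no side effects.
        if previous is absent:
            del sys.modules["sentence_transformers"]
        else:
            sys.modules["sentence_transformers"] = previous


print(demo())
```

The message is kept on a single line so that a pattern joined with .* (as in the suggested test, which pytest.raises checks with re.search) can match across all three parts.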