@fxzhou22
Contributor

@fxzhou22 fxzhou22 commented Nov 23, 2025

Hello, this PR would:

Changes

  • skrub/_text_encoder.py: Added explicit error message and reference to installation guide
  • test_text_encoder_missing.py: New test file created with test_missing_import_error

Note: In the original version, test_text_encoder.py calls pytest.importorskip("sentence_transformers") at module level, which skips the entire file when the package is not installed. That is why I created a new test file for this test.

  • test_text_encoder.py: Added documentation and updated the expected error message for test_missing_import_error
  • CHANGES.rst: Added changelog entry
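
The change to skrub/_text_encoder.py follows a common pattern: catch the failed import and re-raise it with guidance on how to fix it. The sketch below illustrates that pattern only and is not the code from this PR. The helper name `import_optional`, its `feature` and `install_hint` parameters, the package name `hypothetical_missing_package`, and the message wording are all assumptions made for illustration.

```python
import importlib


def import_optional(module_name, feature, install_hint):
    """Import an optional dependency, or raise an ImportError that explains the fix.

    Hypothetical helper: the name, parameters, and wording are illustrative.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"Missing optional dependency {module_name!r}. "
            f"{feature} requires it. {install_hint}"
        ) from exc


# Modules that are available import as usual.
json_module = import_optional("json", "Example", "No action needed.")

# A module that is not installed produces the explanatory message.
try:
    import_optional(
        "hypothetical_missing_package",
        "ExampleEncoder",
        "See the installation guide for the optional dependencies.",
    )
except ImportError as exc:
    message = str(exc)
```

Deferring the import until the estimator is fitted keeps the package importable without the optional dependency, which is why the error surfaces at fit time rather than at import time.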

@fxzhou22 fxzhou22 changed the title from "Fix issue 1763" to "FIX - Improve the error message when the TextEncoder is fitted without installing the additional dependencies" on Nov 23, 2025
@rcap107
Member

rcap107 commented Nov 24, 2025

Hi @fxzhou22, thanks for working on this PR. It's already in good shape, but I made a suggestion on how to keep all the tests in the main test_text_encoder.py file and avoid trouble with importorskip.

The format checks are failing because of issues on our side, I'll deal with that.

@fxzhou22
Contributor Author

Hi @fxzhou22, thanks for working on this PR. It's already in good shape, but I made a suggestion on how to keep all the tests in the main test_text_encoder.py file and avoid trouble with importorskip.

The format checks are failing because of issues on our side, I'll deal with that.

Hi @rcap107, thank you for the feedback! It's probably my fault, but I can't seem to find your suggestion. Would you mind pointing me to where the comment is, please? In the meantime, I looked again into the importorskip problem, and I've resolved it by:

  • In test_text_encoder:
    • Moving importorskip into the encoder fixture
    • Removing encoder as parameter for test_missing_import_error
    • Creating directly TextEncoder instance for test_missing_import_error

In this way, test_missing_import_error is executed and other tests are skipped. I hope this is consistent with your suggestion?

As a result, I have also removed test_text_encoder_missing.py. Sorry I didn't think of this solution earlier.

Thank you also for the format checks part.

replace importorskip inside encoder() to avoid the problem and changes test_missing_import_error with direct creation
@rcap107
Member

rcap107 commented Nov 24, 2025

Hi @rcap107, thank you for the feedback! It's probably my fault, but I can't seem to find your suggestion. Would you mind pointing me to where the comment is, please?

That's strange 🤔 I tried to ping you in the comment, can you see it?

However, I looked again into the importorskip problem, and I've resolved it by :
...
In this way, test_missing_import_error is executed and other tests are skipped. I hope this is consistent with your suggestion?

I think it's better if we use the monkeypatch, rather than modifying the tests that are already there

@fxzhou22
Contributor Author

Hi @rcap107, thank you for the feedback! It's probably my fault, but I can't seem to find your suggestion. Would you mind pointing me to where the comment is, please?

That's strange 🤔 I tried to ping you in the comment, can you see it?

However, I looked again into the importorskip problem, and I've resolved it by :
...
In this way, test_missing_import_error is executed and other tests are skipped. I hope this is consistent with your suggestion?

I think it's better if we use the monkeypatch, rather than modifying the tests that are already there

Sorry, I really can't find it... I'm quite new here and this is in fact my first contribution, so I apologize for the confusion.

I'll look into monkeypatch and undo my changes to already existing tests.

Member

This test can be run from the same file using a small monkeypatch like so:

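import sys

import pandas as pd
import pytest
from sklearn.base import clone


# `encoder` is the fixture already defined in test_text_encoder.py.
def test_missing_import_error(encoder, monkeypatch):
    """Test that a clear error is raised when sentence_transformers is missing.

    We mock the missing dependency by setting its sys.modules entry to None,
    then verify that TextEncoder.fit() raises an ImportError with a helpful
    message.
    """
    # A None entry in sys.modules makes `import sentence_transformers` raise
    # ImportError; monkeypatch restores the original entry after the test.
    monkeypatch.setitem(sys.modules, "sentence_transformers", None)

    st = clone(encoder)
    x = pd.Series(["oh no"])

    # The pattern is matched with re.search; "\\." matches a literal dot.
    err_msg = (
        "Missing optional dependency 'sentence_transformers'.*"
        "TextEncoder requires sentence-transformers.*"
        "install\\.html#deep-learning-dependencies"
    )
    with pytest.raises(ImportError, match=err_msg):
        st.fit(x)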

By monkeypatching sys.modules we can make it look as if sentence_transformers is missing, and avoid issues with the importorskip at the top.
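
The mechanism relied on here is standard Python import behavior: when a name maps to None in sys.modules, importing that name raises ImportError (specifically ModuleNotFoundError) without searching for the module. The sketch below shows this with the standard library only. It uses unittest.mock.patch.dict in place of pytest's monkeypatch fixture and the stdlib module json as a stand-in for sentence_transformers; both substitutions are assumptions made so that the example runs without extra packages.

```python
import importlib
import sys
from unittest import mock


def import_fails_when_hidden(module_name):
    """Return True if importing `module_name` raises ImportError while hidden."""
    # patch.dict restores sys.modules on exit, much like monkeypatch.setitem
    # is undone at test teardown.
    with mock.patch.dict(sys.modules, {module_name: None}):
        try:
            importlib.import_module(module_name)
        except ImportError:
            return True
        return False


hidden = import_fails_when_hidden("json")

# Outside the patch, the module imports normally again.
json = importlib.import_module("json")
restored = json.loads("[1, 2]") == [1, 2]
```

Note that this only simulates a missing package in an environment where it may be installed; it does not help if a module-level skip prevents the test from being collected at all.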

Member

@fxzhou22 can you see this comment?

Contributor Author

Yes, now I get it, thanks a lot!

Member

This file can be removed with the inclusion of the monkeypatch in the main test file

@rcap107
Member

rcap107 commented Nov 24, 2025

Sorry, I really don't find it... I'm quite new here and this is in fact my first contribution, I apologize for it.

It was my fault, I did not submit the review, my bad 😅

You should be able to see it now

@fxzhou22
Contributor Author

Sorry, I really don't find it... I'm quite new here and this is in fact my first contribution, I apologize for it.

It was my fault, I did not submit the review, my bad 😅

You should be able to see it now

Yes, thanks a lot and no worries! I tried your suggested code, but the importorskip at the top of the file still skips test_missing_import_error when sentence-transformers is not installed. With sentence-transformers installed, the tests pass as expected. Here is a summary of the situations:

| Setup | sentence_transformers NOT installed | sentence_transformers installed |
| --- | --- | --- |
| importorskip at module level, test_missing_import_error(encoder, monkeypatch) | 2 tests skipped | All 41 tests passed |
| importorskip in encoder fixture, test_missing_import_error(monkeypatch) | Runs test_missing_import_error and skips other 40 tests | All 41 tests passed |

In my opinion, the bottom-left cell is the case we actually want to test: when the user doesn't have sentence_transformers installed, the explicit error message is shown correctly.

I'll thus commit my changes with importorskip in encoder fixture, test_missing_import_error(monkeypatch). I hope this meets your expectations.
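
The difference between the two rows comes down to where the skip is triggered. The sketch below reproduces the behavior of the second row using the standard library's unittest instead of pytest, a deliberate substitution so that it runs without extra packages. A skip raised in per-test setup plays the role of importorskip inside the encoder fixture, so only the tests that need the dependency are skipped. The package name `hypothetical_optional_dep` and the test class names are made up for illustration.

```python
import importlib
import importlib.util
import unittest

# Hypothetical package name, assumed not to be installed.
OPTIONAL_DEP = "hypothetical_optional_dep"


class NeedsDependency(unittest.TestCase):
    def setUp(self):
        # Analogous to calling pytest.importorskip inside a fixture:
        # only tests that go through this setup are skipped.
        if importlib.util.find_spec(OPTIONAL_DEP) is None:
            self.skipTest(f"{OPTIONAL_DEP} is not installed")

    def test_uses_dependency(self):
        importlib.import_module(OPTIONAL_DEP)


class MissingDependency(unittest.TestCase):
    def test_missing_import_error(self):
        # No skipping setup here, so this test always runs.
        with self.assertRaises(ImportError):
            importlib.import_module(OPTIONAL_DEP)


def run_suite():
    """Run both test classes and return the collected result."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite(
        [
            loader.loadTestsFromTestCase(NeedsDependency),
            loader.loadTestsFromTestCase(MissingDependency),
        ]
    )
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
```

With the skip hoisted to module level instead, the missing-import test would be skipped along with everything else, which is consistent with the first row.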

@rcap107
Member

rcap107 commented Nov 25, 2025

Yes, thanks a lot and no worries! I tried your suggested code, but the importorskip at the top of the file still skips test_missing_import_error when sentence-transformers is not installed. With sentence-transformers installed, the tests pass as expected. Here is a summary of the situations:

| Setup | sentence_transformers NOT installed | sentence_transformers installed |
| --- | --- | --- |
| importorskip at module level, test_missing_import_error(encoder, monkeypatch) | 2 tests skipped | All 41 tests passed |
| importorskip in encoder fixture, test_missing_import_error(monkeypatch) | Runs test_missing_import_error and skips other 40 tests | All 41 tests passed |

In my opinion, the bottom-left cell is the case we actually want to test: when the user doesn't have sentence_transformers installed, the explicit error message is shown correctly.

On the CI, the version of the test that uses monkeypatch runs in the configuration that includes sentence_transformers, and removes it to emulate the situation in which the dependency is missing. So in practice it would work just fine to test the import error.

I'll thus commit my changes with importorskip in encoder fixture, test_missing_import_error(monkeypatch). I hope this meets your expectations.

I do like your solution better, since it's still testing the import error in the other configurations.

@rcap107
Member

rcap107 commented Nov 25, 2025

Looks good to me, thanks a lot @fxzhou22

@rcap107 rcap107 merged commit ba98a87 into skrub-data:main Nov 25, 2025
29 checks passed
@fxzhou22
Contributor Author

Yes, thanks a lot and no worries! I tried your suggested code, but the importorskip at the top of the file still skips test_missing_import_error when sentence-transformers is not installed. With sentence-transformers installed, the tests pass as expected. Here is a summary of the situations:

| Setup | sentence_transformers NOT installed | sentence_transformers installed |
| --- | --- | --- |
| importorskip at module level, test_missing_import_error(encoder, monkeypatch) | 2 tests skipped | All 41 tests passed |
| importorskip in encoder fixture, test_missing_import_error(monkeypatch) | Runs test_missing_import_error and skips other 40 tests | All 41 tests passed |

In my opinion, the bottom-left cell is the case we actually want to test: when the user doesn't have sentence_transformers installed, the explicit error message is shown correctly.

On the CI, the version of the test that uses monkeypatch runs in the configuration that includes sentence_transformers, and removes it to emulate the situation in which the dependency is missing. So in practice it would work just fine to test the import error.

I'll thus commit my changes with importorskip in encoder fixture, test_missing_import_error(monkeypatch). I hope this meets your expectations.

I do like your solution better, since it's still testing the import error in the other configurations.

Thanks for the explanation, and I'm glad I could help. It was a good experience and I've learned a lot. Thank you also for the clear documentation that guided me well with my first contribution!

Development

Successfully merging this pull request may close these issues.

Improve the error message when the TextEncoder is fitted without installing the additional dependencies

2 participants