@bbrowning

Newer versions of Docling can raise an exception in the custom garbage-collection code of TesseractOcrModel that nothing ever catches, because exceptions raised during garbage collection are generally not caught by anything. The exception is harmless for all of our InstructLab use cases, and it only happens when the TesseractOcrModel was not able to construct itself anyway, a case we already check for and handle by falling back to EasyOCR or other implementations.

However, py3-unitcov by default fails the test suite if any of these unraisable exceptions occur. This change adjusts that so they no longer fail the test suite. Messages about the exception still get logged during regular py3-unit testing, so we can see when this gets fixed in Docling, but there's no need for it to actually fail our test suite.

Issue resolved by this Pull Request:
Resolves #3324

@mergify mergify bot added the CI/CD (Affects CI/CD configuration) and testing (Relates to testing) labels Apr 29, 2025
Signed-off-by: Ben Browning <[email protected]>
@bbrowning bbrowning force-pushed the docling-gc-exception-ignore branch from 0a7c8a5 to 65c1efb Compare April 29, 2025 17:11
@booxter booxter left a comment

Some TODO comments suggesting that this can be removed after bug XXX is fixed in Docling would be nice, but I won't hold for it. Thanks!!!

@mergify mergify bot added the one-approval (PR has one approval from a maintainer) label Apr 29, 2025
@booxter booxter requested a review from a team April 29, 2025 17:38
@mergify mergify bot removed the one-approval (PR has one approval from a maintainer) label Apr 29, 2025
@mergify mergify bot merged commit 26258be into instructlab:main Apr 29, 2025
27 checks passed
@bbrowning bbrowning deleted the docling-gc-exception-ignore branch April 29, 2025 18:47
@courtneypacheco

@mergify backport release-v0.26

@mergify

mergify bot commented Apr 30, 2025

backport release-v0.26

✅ Backports have been created

Development

Successfully merging this pull request may close these issues.

Unit tests fail with "AttributeError: 'TesseractOcrModel' object has no attribute 'reader'"

4 participants