MAINT: add a fuzzing test to try to introduce segfaults #24175
Conversation
I would name the new test file `test_junk_calls.py` to be consistent with the test class name; I also think `test_junk_calls.py` is a little clearer about what it's trying to do.
The new test is pretty slow, about 97 seconds on my machine. Obviously breaking it up into many tests using parametrization and fixtures will make this into many shorter tests, but still, that's a decent amount of overhead.
There's a lot of `itertools.product` happening; are some of these combinations redundant or unnecessary? Could you use fewer choices in some of the categories you're enumerating over, especially if any are particularly expensive?
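One way to bound the runtime without dropping whole categories would be to sample a fixed number of combinations from the product rather than exhausting it. A minimal sketch, where the category lists are purely illustrative stand-ins for whatever the fuzzer actually enumerates:

```python
import itertools
import random

# Illustrative category lists; the real fuzzer's categories would go here.
dtypes = ["float64", "int32", "complex128"]
shapes = [(0,), (1,), (3, 3)]
junk_args = [None, "x", -1, 1e300]

random.seed(0)  # deterministic sampling so failures reproduce
combos = list(itertools.product(dtypes, shapes, junk_args))
for dtype, shape, arg in random.sample(combos, k=min(20, len(combos))):
    pass  # call the method under test with this combination
```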
I'm not able to reproduce the segfault you mention in the PR description. If you end up with a reproducer, please file a separate issue.
numpy/core/tests/test_segfault.py
Outdated

    warnings.filterwarnings("ignore")
    # loop through the named methods
    for method in methods:
        print('checking method: `{}`'.format(method))
Better to use a pytest parametrized test so you can get this sort of reporting from pytest itself.
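A minimal sketch of that suggestion; the method list and test body are illustrative, not the PR's actual code:

```python
import numpy as np
import pytest

# One test per public array method; pytest then reports each separately.
METHODS = [m for m in dir(np.empty(1)) if not m.startswith("_")]

@pytest.mark.parametrize("method", METHODS)
def test_junk_call(method):
    arr = np.ones(3)
    try:
        getattr(arr, method)()
    except Exception:
        pass  # ordinary exceptions are fine; only crashes matter
```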
numpy/core/tests/test_segfault.py
Outdated

    # a list of all methods on the numpy array
    methods = dir(np.empty(1))

    with warnings.catch_warnings():
If this is supposed to catch floating point warnings from NumPy, probably better to use `np.errstate`.
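For reference, a minimal sketch of `np.errstate` usage, silencing floating point errors locally instead of filtering all warnings:

```python
import numpy as np

# Silence specific floating point error classes within the block only.
with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
    np.array([1e308]) * 10   # overflow, silently ignored
    np.array([0.0]) / 0.0    # invalid/divide, silently ignored
```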
There were some errors I didn't see `errstate` catching; admittedly I might be using it incorrectly, but I was still seeing these make it through:
test_junk_calls.py:75: RuntimeWarning: overflow encountered in cast
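One belt-and-suspenders option, sketched here as an assumption rather than a worked-out fix: combine `np.errstate` with `warnings.catch_warnings` so that both errstate-governed floating point errors and warnings emitted through the Python warnings machinery (like the cast overflow above) are muted:

```python
import warnings
import numpy as np

# Mute errstate-governed FP errors and Python-level RuntimeWarnings.
with np.errstate(all="ignore"), warnings.catch_warnings():
    warnings.simplefilter("ignore")
    np.float16(1e30)  # may otherwise warn: overflow encountered in cast
```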
Sounds good! I got a reproducer on
Perhaps you could use hypothesis rather than a home-grown fuzzer? It provides a structured way to explore property-based testing. It saves state between runs, so it will allow capturing input cases that are problematic. Adding this to every CI run would be too expensive, but maybe we could add a marker to run it only rarely.
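A rough sketch of what a hypothesis-based variant might look like; the strategies, limits, and method list are illustrative assumptions, not a worked-out design:

```python
import numpy as np
from hypothesis import given, settings, strategies as st

METHODS = [m for m in dir(np.empty(1)) if not m.startswith("_")]

@settings(max_examples=200, deadline=None)
@given(method=st.sampled_from(METHODS),
       arg=st.one_of(st.integers(), st.floats(allow_nan=True), st.text()))
def test_fuzz_methods(method, arg):
    try:
        getattr(np.ones(3), method)(arg)
    except Exception:
        pass  # hypothesis replays crashing inputs from its local database
```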
There is also OSS-Fuzz from Google, which this might actually fit into well? (Mainly pointing it out in case you find it interesting.)
Yeah, using a fuzzing library could be desirable, although adding a dependency to numpy is above my pay grade 😄. I think there is perhaps an argument that the needs here are a little more special-case than a general-purpose fuzzer, and a self-contained script is usually easier to maintain. Either way, I made the following changes based on suggestions:
Do we want to pursue this? I am fine with just putting it in. It found some nice bugs! OTOH, it would be more useful to also fuzz functions and to offload it so it doesn't run regularly. But if we don't integrate it into the test suite (and thus keep it very fast), I am not sure we will actually end up running it often enough to be useful.
Let's discuss it at a community/triage meeting.
@ngoldbaum do you want to make a call either way? It seems somewhat useful, but I agree with Matti that the real deal would be something property-based (hypothesis) or a dedicated extensive fuzzer (like OSS-Fuzz).
I think it probably makes sense to pull this in, since the test has a much faster runtime than when this PR was initially proposed (although I haven't checked that today). Hypothesis or something would be better, but that requires someone to wire it up, and this already exists and is finding real bugs in poorly tested error paths in numpy. However, the test failures are real and need to be fixed before this can be merged. There's a segfault on PyPy that needs to be looked at, the full tests are failing because of the warning-level test, and it looks like the build with assertions turned on and some Windows builds had issues as well, although the build log has been deleted on Azure, so I triggered another run.
Adds the fuzzing test used to surface two segfaults in #24023, with a slightly cleaned-up loop and check counts sized to finish in less than a minute. Note that this surfaces another intermittent segfault, probably in `array.choose`, that is beyond my ability to debug on a Mac.

While this catches things, especially in a build matrix, it still takes quite a bit of work to hunt down the guilty arguments (especially since segfaults kill everything). I played with writing args to a tempfile, but it was unacceptably slow for a unit test. I totally understand if the project doesn't want to run this in the large test matrix.
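For context, a sketch of the tempfile idea mentioned above; the names are hypothetical. Writing and flushing a breadcrumb before every call means a segfault leaves the offending method and args on disk, but the per-call I/O is exactly what made this too slow:

```python
import json
import tempfile

# Keep the trace file around after a crash so it can be inspected.
trace = tempfile.NamedTemporaryFile("w", suffix=".log", delete=False)

def traced_call(obj, method, args):
    trace.write(json.dumps({"method": method, "args": repr(args)}) + "\n")
    trace.flush()  # must leave the process buffer before the call,
                   # or a segfault loses the breadcrumb
    try:
        getattr(obj, method)(*args)
    except Exception:
        pass  # ordinary exceptions are fine; only crashes matter
```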
This is mostly deterministic, but it does check values produced by `numpy.random`. I can change this to always be seeded from a constant unless that is somehow already done in the test framework.
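Constant seeding would be a one-liner with the standard `Generator` API; a minimal sketch, where the generated values are purely illustrative:

```python
import numpy as np

# Seed from a constant so any random values the fuzzer feeds in
# reproduce identically across runs.
rng = np.random.default_rng(12345)
junk_values = [rng.random(8), rng.integers(-1000, 1000, size=8)]
```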