MAINT: add a fuzzing test to try to introduce segfaults #24175

Open
wants to merge 3 commits into base: main

Conversation

@mikedh commented Jul 13, 2023

Adds the fuzzing test used to surface two segfaults in #24023 with a slightly cleaned-up loop and check counts sized to finish in less than a minute. Note that this surfaces another intermittent segfault probably in array.choose that is beyond my ability to debug on a Mac:

...
checking method: `byteswap`
checking method: `choose`
zsh: segmentation fault  python test_segfault.py
mikedh@luna tests % python -c "import numpy; print(numpy.__version__)"
1.25.1
mikedh@luna tests % python --version
Python 3.9.2
mikedh@luna tests % uname -a
Darwin luna.localdomain 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:22 PDT 2023; root:xnu-8796.121.3~7/RELEASE_X86_64 x86_64

While this catches things, especially in a build matrix, it still takes quite a bit of work to hunt down the guilty arguments (especially since segfaults kill everything). I played with writing args to a tempfile but it was unacceptably slow for a unit test. I totally understand if the project doesn't want to run this in the large test matrix.

This is mostly deterministic, but it does check values produced by numpy.random. I can change this to always be seeded from a constant unless that is somehow already done in the test framework.
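
For context, a minimal, hypothetical sketch of the kind of junk-call loop described above; the argument pool and names are illustrative rather than the actual test file, and it is seeded from a constant so runs stay deterministic:

import warnings
import numpy as np

# a small pool of deliberately nonsensical arguments (illustrative only)
JUNK_ARGS = [None, -1, 0, 1.5, "junk", b"\x00", [], {}, np.empty(0)]

def fuzz_array_methods(seed=42):
    # a fixed seed keeps the run deterministic
    rng = np.random.default_rng(seed)
    arr = rng.integers(0, 100, size=10)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        for name in dir(arr):
            method = getattr(arr, name)
            if not callable(method):
                continue
            print('checking method: `{}`'.format(name))
            for arg in JUNK_ARGS:
                try:
                    method(arg)
                except Exception:
                    # Python-level exceptions are fine; the point is that
                    # nothing should crash the interpreter
                    pass

if __name__ == "__main__":
    fuzz_array_methods()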

@ngoldbaum (Member) left a comment

I would name the new test file test_junk_calls.py to be consistent with the test class name; I also think test_junk_calls.py is a little clearer about what it's trying to do.

The new test is pretty slow, about 97 seconds on my machine. Obviously breaking it up into many tests using parametrization and fixtures will make this into many shorter tests, but still, that's a decent amount of overhead.

There's a lot of itertools.product happening; are some of these combinations redundant or unnecessary? Could you use fewer choices in some of the categories you're enumerating over, especially if any are particularly expensive?

I'm not able to reproduce the segfault you mention in the PR description. If you end up with a reproducer, please file a separate issue.

warnings.filterwarnings("ignore")
# loop through the named methods
for method in methods:
    print('checking method: `{}`'.format(method))

Member:
better to use a pytest parametrized test so you can get this sort of reporting from pytest itself
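
For illustration, a hedged sketch of what that parametrization could look like; the method list and the junk arguments are placeholders rather than the PR's actual code:

import pytest
import numpy as np

# one pytest test case per ndarray method, so failures are reported per method
METHODS = sorted(set(dir(np.empty(1))))

@pytest.mark.parametrize("method_name", METHODS)
def test_junk_call(method_name):
    arr = np.arange(10)
    attr = getattr(arr, method_name)
    if not callable(attr):
        pytest.skip("not a callable attribute")
    for arg in (None, -1, "junk", []):
        try:
            attr(arg)
        except Exception:
            # exceptions are acceptable; segfaults are not
            pass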

# a list of all methods on the numpy array
methods = dir(np.empty(1))

with warnings.catch_warnings():

Member:
If this is supposed to catch floating point warnings from NumPy, probably better to use np.errstate.
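
For reference, np.errstate scopes NumPy's floating-point error handling (divide, over, under, invalid) to a block; a minimal example:

import numpy as np

with np.errstate(all="ignore"):
    np.float64(1.0) / np.float64(0.0)   # divide-by-zero would otherwise warn
    np.exp(np.float64(1000.0))          # overflow would otherwise warn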

Author:
There were some errors I didn't see errstate catching; admittedly I might be using it incorrectly, but I was still seeing these make it through:

test_junk_calls.py:75: RuntimeWarning: overflow encountered in cast
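
A minimal illustration of suppressing that kind of warning with the warnings machinery, as the test currently does; the cast below is only an example of a value that overflows float32 on recent NumPy versions:

import warnings
import numpy as np

with warnings.catch_warnings():
    # a blanket ignore filter silences RuntimeWarning however it is raised
    warnings.simplefilter("ignore", RuntimeWarning)
    np.float64(1e300).astype(np.float32)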

@mikedh (Author) commented Jul 13, 2023

Sounds good! I got a reproducer on `choose` but it's still very intermittent; I'll try to clean it up into a bug report. For this PR:

  • I'll rename the file to test_junk_calls.py
  • refactor the loops into pytest fixtures
  • see if the sample data can be de-duplicated in a way that still catches the issues surfaced in 1.25.0
  • aim to reduce the test runtime to ~10s

@mattip (Member) commented Jul 14, 2023

Perhaps you could use hypothesis rather than a home-grown fuzzer? It provides a structured way to explore property-based testing, and it saves state between runs, which allows capturing problematic input cases.

Adding this to every CI run would be too expensive, but maybe we could add a marker to run it only rarely.
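
As an illustration only (not something in this PR), a hedged sketch of what a hypothesis-driven version might look like; the strategies and names are placeholders:

import numpy as np
from hypothesis import given, settings, strategies as st

METHODS = sorted(name for name in dir(np.empty(1)) if not name.startswith("_"))

# a strategy producing deliberately ill-typed arguments
junk = st.one_of(st.none(), st.integers(), st.floats(), st.text(), st.binary())

@settings(deadline=None)
@given(method_name=st.sampled_from(METHODS), arg=junk)
def test_junk_call_property(method_name, arg):
    arr = np.arange(10)
    attr = getattr(arr, method_name)
    if not callable(attr):
        return
    try:
        attr(arg)
    except Exception:
        # the property under test is simply "no crash"
        pass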

@seberg (Member) commented Jul 14, 2023

There is also OSS-Fuzz from Google, which this might actually fit into well (mainly pointing it out in case you find it interesting). Right now they have something for NumPy, but it only fuzzes boring calls to loadtxt, IIRC: https://github.com/google/oss-fuzz/tree/master/projects/numpy

@mikedh (Author) commented Jul 14, 2023

Yeah, using a fuzzing library could be desirable, although adding a dependency to numpy is above my pay grade 😄. I think there is an argument that the needs here are a bit more specialized than a general-purpose fuzzer covers, and a self-contained script is usually easier to maintain. Either way, I made the following changes based on the suggestions:

  • I reduced the runtime to 1.2s on my laptop by reducing the number of cases checked (i.e. checking an array with every byte order and dtype) and verified that it still catches the errors surfaced in numpy==1.25.0.
  • I refactored the argument generation into a pytest fixture.
  • I changed the argument generation to use itertools.product and added a check to make sure it wasn't including any duplicates (sketched below).
  • I renamed the file to test_junk_calls.py.
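
As a sketch of the itertools.product approach and duplicate check mentioned above (the argument pools and fixture name here are hypothetical, not the PR's actual code):

import itertools
import pytest
import numpy as np

SCALARS = (None, -1, 0, 1.5)
SHAPES = ((), (0,), (3,), (2, 2))
DTYPES = (np.int64, np.float64)

@pytest.fixture(scope="module")
def junk_arguments():
    # every combination of the pools above, generated once per test module
    combos = list(itertools.product(SCALARS, SHAPES, DTYPES))
    # product over pools of distinct values should never repeat a combination
    assert len(combos) == len(set(combos))
    return combos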

@seberg (Member) commented Jul 31, 2023

Do we want to pursue this? I am fine with just putting it in; it found some nice bugs! OTOH, it would be more useful to also fuzz functions and offload it so it doesn't run regularly. But if we don't integrate it into the test suite (which requires keeping it very fast), I am not sure we will actually end up running it often enough to be useful.

@mattip added the "triage review" label (Issue/PR to be discussed at the next triage meeting) on Jul 31, 2023
@mattip (Member) commented Jul 31, 2023

Let's discuss it at a community/triage meeting.

@seberg (Member) commented Aug 11, 2023

@ngoldbaum do you want to make a call either way? It seems somewhat useful, but I agree with Matti that the real deal would be property-based testing (hypothesis) or a dedicated extensive fuzzer (like OSS-Fuzz).

@ngoldbaum (Member) commented

I think it probably makes sense to pull this in, since the test has a much faster runtime than when this PR was initially proposed (although I haven't checked that today). Hypothesis or something would be better, but that requires someone to wire it up, and this already exists and is finding real bugs in poorly tested error paths in numpy.

However, the test failures are real and need to be fixed before this can be merged. There's a segfault on PyPy that needs to be looked at, the full tests are failing because of the warning-level test, and it looks like the build with assertions turned on and some Windows builds had issues as well, although the build log has been deleted on Azure, so I triggered another run.

@seberg removed the "triage review" label on Jan 10, 2024
Projects: Awaiting a code review

4 participants