Doing something about slow tests again #25472
I propose that we introduce an `xslow` marker for the very slowest tests.
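The marker proposed here (named `xslow` later in this thread) could be registered along the lines of the following `conftest.py` sketch. This is a minimal illustration, not NumPy's actual test configuration:

```python
# Hypothetical conftest.py fragment registering an "xslow" marker,
# so that `pytest -m "not xslow"` can deselect the very slowest tests.

def pytest_configure(config):
    # addinivalue_line is the standard pytest hook-config API for
    # registering custom markers so they don't trigger warnings.
    config.addinivalue_line(
        "markers",
        "xslow: extremely slow test, deselect with -m 'not xslow'",
    )
```

A CI job would then run `pytest -m "not xslow"` by default and a dedicated (e.g. cron) job would run `pytest -m xslow`.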
All the PyPy tests are slow; everything else finishes while the three PyPy jobs are still running. It might be useful to check why. @mattip any ideas?
Can we group parametrized tests? I suspect there is some excessive parametrization, and this doesn't look like it would notice it.
Having an xslow would be fine. It would be nice to make sure that they are run on release wheels, but OK...
You might also want to consider moving more CI to cron jobs, perhaps the PyPy and SIMD ones, allowing one to trigger those manually by setting a specific label (we do this in astropy for emulated architectures). Many PRs only touch Python code; perhaps this can be recognized/labelled automatically, with CI depending on the label?
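The astropy-style setup described above could look roughly like this in GitHub Actions. This is a hypothetical sketch (the label name `run-heavy-ci` and job name are made up), not a config taken from any of the projects mentioned:

```yaml
# Hypothetical workflow trigger: run heavy jobs (e.g. PyPy, SIMD) on a
# weekly cron, or on PRs that carry an opt-in label.
on:
  schedule:
    - cron: "0 3 * * 1"   # weekly, Monday 03:00 UTC
  pull_request:
    types: [labeled, synchronize]

jobs:
  heavy-tests:
    if: >
      github.event_name == 'schedule' ||
      contains(github.event.pull_request.labels.*.name, 'run-heavy-ci')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```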
FWIW, I wouldn't mind such a setup; the one thing is it would be nice to auto-open an issue on failure I guess. I wonder if @pllim might just know how to set that up quite quickly? For C vs. Python, I am not sure I think it is worthwhile; running only the most basic tests on PyPy could be moved or not (because it does fail occasionally on larger C-changes). The SIMD/architecture tests would be nice for cron + explicit triggering though.
The very slowest ones don't seem to be affected; OTOH the parametrized ones are often not marked slow, and beyond the first ~10 or so they seem to start to dominate.
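One way to check whether parametrized variants dominate cumulatively is to feed a saved `pytest --durations=0` report through a small script that sums the time per test function across its parametrizations. This is a sketch: it assumes the usual shape of pytest's durations output and only counts the `call` phase.

```python
import re
from collections import defaultdict

def aggregate_durations(lines):
    """Sum per-test-function time across parametrized variants.

    Expects lines shaped like pytest's --durations report, e.g.
    "0.25s call numpy/_core/tests/test_x.py::test_y[param-1]".
    Only the "call" phase is counted (setup/teardown are ignored).
    """
    totals = defaultdict(float)
    pat = re.compile(r"([\d.]+)s\s+call\s+(\S+)")
    for line in lines:
        m = pat.search(line)
        if not m:
            continue
        seconds, test_id = float(m.group(1)), m.group(2)
        # Strip the "[...]" parametrization suffix so variants group together.
        base = test_id.split("[", 1)[0]
        totals[base] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

report = [
    "0.25s call t.py::test_a[1]",
    "0.25s call t.py::test_a[2]",
    "0.40s call t.py::test_b",
]
print(aggregate_durations(report))
# -> [('t.py::test_a', 0.5), ('t.py::test_b', 0.4)]
```

Sorting the aggregated totals surfaces test functions that are individually fast but expensive in aggregate, which a plain per-test durations list hides.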
Intriguing. Anything called
Hello! I am not familiar with numpy tests so I can only say how astropy is doing it and you can adapt it as you see fit for this package.
Hope this helps and happy holidays!
PyPy is known to be slow on C extensions. I run weekly tests of PyPy HEAD against common or complicated projects' HEAD in a binary-testing repo. It probably makes sense for NumPy to limit testing of PyPy to a sampling strategy and not run it on every PR.
No need I think, these tests are highly unlikely to fail if they're still run in a regular CI job. So making that a manual step in the release process would be a bit much.
This type of improvement is certainly of interest as a CI improvement I'd say. It's a bit orthogonal to the main goals of this issue though, which are to speed up wheel builds to (a) reduce Cirrus CI costs, and (b) improve on iteration time when debugging CI issues. Moving away from Azure completely also falls in the "desired CI improvements" bucket. The new BLAS CI jobs could be special-cased too, like SIMD and docs.
nice, that's a helpful tool.
The main problem is PyPy wheel builds, not regular CI. I think we want to keep these wheels. We could look at not running the full test suite though, only the default (fast) tests.
Now that cp312 is mainstream we could probably build all the macOS wheels in two matrix entries: cp39-cp312 for macOS <14 (i.e. with OpenBLAS) and cp39-cp312 for macOS >=14 (i.e. with Accelerate). The CI runners have enough grunt to get through them all in under an hour. That would probably reduce cost, but wouldn't improve iteration speed while debugging.

To get through the CI builds quicker for linux_aarch64 we could give each matrix entry more CPU; currently they're only given 1 core each. However, if the CPU is underutilised then one doesn't get efficiency gains.

W.r.t. debugging Cirrus CI: it should be possible to run all the jobs locally if one has a Mac. The Cirrus CLI allows one to run the same config on your local computer. I was thinking of writing a guide for how to debug CI configs.
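The two-entry matrix described above might look roughly like the following `.cirrus.yml` fragment. This is a hypothetical sketch (task name, image, and env-var choices are illustrative), not NumPy's actual Cirrus configuration:

```yaml
# Hypothetical Cirrus CI fragment: one matrix entry per macOS deployment
# target, each building all supported CPython wheels via cibuildwheel.
macos_wheels_task:
  macos_instance:
    image: ghcr.io/cirruslabs/macos-sonoma-xcode:latest
  matrix:
    - env:
        CIBW_BUILD: "cp39-* cp310-* cp311-* cp312-*"
        MACOSX_DEPLOYMENT_TARGET: "10.13"   # OpenBLAS builds
    - env:
        CIBW_BUILD: "cp39-* cp310-* cp311-* cp312-*"
        MACOSX_DEPLOYMENT_TARGET: "14.0"    # Accelerate builds
  build_script: python -m cibuildwheel
```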
Probably best not to. It won't make too much difference in overall runtime, just save a couple of minutes for avoiding repo clones and caching conda-forge downloads. Not worth the longer runtime and churn I'd say.
2 CPUs would help I suspect; build time will be almost 2x faster, and test suite runtime ~1.7x or so.
That would be very useful I think. Also for other projects to refer to.
Yes, please!
See https://github.com/numpy/numpy/wiki/Debugging-CI-guidelines for some basic guidelines for debugging CI configurations. Bear in mind it's a WIP. @rgommers, this should be good for both scipy and numpy. |
The test suite has gradually become a bit slower again, and this makes CI jobs take longer - which now is an additional hassle because of the cost (as of now, $2.75 per wheel build run, see #24280 (comment)). Some jobs are quite slow, and the PyPy on Windows one is ridiculously slow, taking 1h 22m.
Here are the top 200 slowest tests for the `-m full` test suite (which is what the wheel builds run):

One of the top offenders is `f2py`; there is already a separate issue for that: gh-25134. Separating out those tests so they don't run at all on wheel builds will take care of that problem.

For the rest we should go through and deal with some of the tests in the above list case by case. I'm having a look at `TestStructuredObjectRefcounting` now, which is one of the worst tests.