TST: Mark slow tests in astropy/samp #16095

Merged: 1 commit merged into astropy:main on Mar 12, 2024

Conversation

neutrinoceros
Contributor

Description

A quick follow-up to #16064.
I found the 10 longest (non-slow, non-parametrized) tests with pytest astropy/ --timer-top-n 20 (using pytest-timer), which is also about the same set of tests that take longer than a second on my machine (supposedly a pretty fast M2).
On my install, pytest astropy takes about 2 min 15 s on main and 1 min 55 s with this branch, so I estimate that skipping those 10 tests (out of ~24k) could save about 10% of CI time.
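
For context, this relies on pytest's standard marker machinery; a minimal sketch (generic pytest, not astropy's actual configuration):

# conftest.py
def pytest_configure(config):
    # Register the marker so strict marker checking accepts @pytest.mark.slow.
    config.addinivalue_line("markers", "slow: mark a test as slow to run")

# Fast CI runs can then deselect the marked tests:
#   pytest astropy/ -m "not slow"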

  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.

Contributor

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that warrants a "What's new?" entry? If so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

Contributor

👋 Thank you for your draft pull request! Did you know that you can use [ci skip] or [skip ci] in your commit messages to skip running continuous integration tests until you are ready?
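
For example (a hypothetical commit message):

git commit -m "TST: tweak sleep durations [ci skip]"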

@neutrinoceros neutrinoceros marked this pull request as ready for review February 23, 2024 08:15
@neutrinoceros neutrinoceros requested a review from saimn as a code owner February 23, 2024 08:15
@pllim pllim added this to the v6.1.0 milestone Feb 23, 2024
@pllim
Member

pllim commented Feb 23, 2024

I'll have to ponder later whether the cost of not running these for every combination is worth the benefit. It's not like we just sit staring at the CI until it finishes.

@mhvk
Contributor

mhvk commented Feb 25, 2024

I like this idea, but it would be good to get some input on whether these specific tests are likely to fail on just one architecture.

Also, some of the slowest tests may be ones that are excessively parametrized (or the hypothesis ones in astropy.time.tests.test_precision). I think there may be a way to look at those - maybe numpy/numpy#25472 (comment)? But obviously fine to do that later.
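
For instance, pytest's built-in duration report lists each parametrized case separately and can surface these without extra plugins (a generic invocation, not necessarily what was used here; --durations-min needs pytest >= 6.2):

pytest astropy/ --durations=20 --durations-min=0.5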

pllim
pllim previously requested changes Feb 26, 2024
Member

@pllim pllim left a comment

In general, I am okay with marking SAMP as slow, but not comfortable with io.fits and utils. FITS is a can of worms and utils is the backbone of many things. I want them both to be tested widely. A bit of slowness in CI is a price I am willing to pay (not that we're paying, har har).

Thank you for your understanding!

@@ -204,6 +204,7 @@ def test_disable_image_compression(self):
        with fits.open(self.data("comp.fits")) as hdul:
            assert isinstance(hdul[1], fits.CompImageHDU)

    @pytest.mark.slow
Contributor

Hmm, looking at the actual test, I think the only reason it is slow is because it sleeps for 1 second! Can we change that to 0.1 s?

Contributor Author

It works on my system but I'm not sure how portable it is; maybe the test would fail on slower archs/containers/VMs? Anyway, time.sleep(1) is used in combination with pytest.mark.slow in other similar tests, but admittedly the difference is that in this one case the sleep is actually responsible for most of the test's time, so let's try that.
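
For reference, the pattern under discussion looks roughly like this (a hypothetical stand-in, not the actual io.fits test; the point is that the sleep dominates the runtime):

import time

def test_update_mode_roundtrip(tmp_path):
    path = tmp_path / "data.txt"
    path.write_text("original")
    # Was time.sleep(1); cut to 0.1 s since the sleep accounts for almost
    # all of this test's runtime. The risk: flakiness on slow systems.
    time.sleep(0.1)
    path.write_text("updated")
    assert path.read_text() == "updated"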

@@ -48,6 +48,7 @@ def test_open(self):
        assert ghdu.data[0].data.shape == naxes[::-1]
        assert ghdu.data[0].parnames == parameters

    @pytest.mark.slow
Contributor

And the same here! It sleeps for 1 second.

@@ -927,6 +927,7 @@ def test_image_update_header(self):

    # The test below raised a `ResourceWarning: unclosed transport` exception
    # due to a bug in Python <=3.10 (cf. cpython#90476)
    @pytest.mark.slow
Contributor

Here too!

Contributor Author

I note that even after cutting 90% of the sleep time, this test is still about 5 times longer than the second slowest in the module. That's about 0.2 s on my machine, which is probably acceptable. Let's try!

Member

Like I said before, I am not comfortable marking anything in io.fits as slow here. I will defer to @saimn .

Contributor

Well, there is no good value here; it depends on the system, disk, caching, etc.
So 1 s is maybe a bit too much, but it seems reasonable if it avoids wasting time debugging issues on CI or on exotic platforms.

@@ -17,6 +17,7 @@ def test_SAMPHubServer():
    SAMPHubServer(web_profile=False, mode="multiple", pool_size=1)


@pytest.mark.slow
Contributor

And another one that sleeps 1 sec.

Contributor Author

The very next test also combines sleep(1) + pytest.mark.slow. In the other tests, the file system would be the bottleneck and I think it's reasonable to assume that 1 s is (too) generous, but here I'm actually not so sure: starting a server seems like a much more involved task than changing flags on a file.
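
For reference, the start/sleep/stop pattern being discussed looks roughly like this (a sketch built from the surrounding diff; the real test body may differ):

import time
import pytest
from astropy.samp import SAMPHubServer

@pytest.mark.slow
def test_SAMPHubServer_run():
    hub = SAMPHubServer(web_profile=False, mode="multiple", pool_size=1)
    hub.start()
    time.sleep(1)  # deliberately generous: server start-up time varies widely
    hub.stop()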

@@ -155,6 +155,7 @@ def test_progress_bar_as_generator():
    assert sum == 1225


@pytest.mark.slow
Contributor

test_progress_bar_func.func also sleeps (though not as long)

Contributor Author

For this one, the sleep time is actually not dominant, so I think it should still be marked as slow with no other changes.
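
A sketch of that shape (a hypothetical example, not the actual test): each mapped call sleeps only briefly, so trimming the sleep buys little and the test stays marked slow:

import time
import pytest
from astropy.utils.console import ProgressBar

def _work(i):
    time.sleep(0.01)  # small per-call sleep; the total comes from the many calls
    return i * 2

@pytest.mark.slow
def test_progress_bar_map():
    results = ProgressBar.map(_work, range(100))
    assert list(results) == [i * 2 for i in range(100)]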

@mhvk
Contributor

mhvk commented Feb 26, 2024

Looking through the actual tests, quite a few are slow just because they are sleeping. With 12000 tests, one really should not use even sleep(1) as a matter of course (but then, the FITS tests are not the newest...). I'm less sure why the utils/data tests are so slow; those tests construct whole lists of fake URLs, but by inspection it is not clear how many there actually are.

@namurphy
Contributor

namurphy commented Mar 1, 2024

Over in PlasmaPy, we've started marking tests as slow if they take longer than ∼0.25–0.5 seconds. By doing that, and by caching .tox between runs (PlasmaPy/PlasmaPy#2552), we've been able to get our "skip slow" tests to finish in about a minute. That covers ∼90% of our ∼4300 tests. We still run the full test suite once in CI to get code coverage, and we use cron jobs to run our test suite across multiple architectures and Python versions. For CI, it's been incredibly helpful to get rapid feedback.

This is a long way of saying...thank you for doing this! 😺

@mhvk
Contributor

mhvk commented Mar 1, 2024

I like the idea of caching - thanks for linking to that example setup!!

@neutrinoceros neutrinoceros force-pushed the tests/rfc/mark_slow_tests branch from 5fd66b3 to 6d53ac6 on March 11, 2024 14:23
@@ -925,9 +925,6 @@ def test_image_update_header(self):
        with fits.open(self.temp("test0.fits")) as hdul:
            assert (orig_data == hdul[1].data).all()

    # The test below raised a `ResourceWarning: unclosed transport` exception
    # due to a bug in Python <=3.10 (cf. cpython#90476)
    @pytest.mark.filterwarnings("ignore:unclosed transport <asyncio.sslproto")
Contributor Author

Since this cleanup is actually orthogonal, I've opened #16183 for it.

@neutrinoceros
Contributor Author

Sorry, I forgot about this one for a couple of weeks!
In addressing @mhvk's review, I actually reverted all the slow markers I had previously added in io.fits tests, and instead made those tests faster (or less sleepy, depending on how you want to look at it).
@pllim, would you be happy with the change if I also reverted all the markers I added to utils tests?

@neutrinoceros neutrinoceros force-pushed the tests/rfc/mark_slow_tests branch from 6d53ac6 to 93f55eb on March 11, 2024 14:48
Contributor

@mhvk mhvk left a comment

Looks good. I'll approve for all but utils, deferring to @pllim for that.

Member

I am also not comfortable with marking utils tests as slow.

Contributor Author

reverted

Member

Thanks! Much appreciated.

@neutrinoceros neutrinoceros force-pushed the tests/rfc/mark_slow_tests branch from 93f55eb to 40fdec3 on March 11, 2024 15:59
@neutrinoceros neutrinoceros changed the title from "TST: mark top 10 slowest tests with pytest.mark.slow (save ~10% on CI)" to "TST: Mark slow tests in astropy/samp and reduce sleeping time in long running io/fits tests" on Mar 11, 2024
@pllim pllim dismissed their stale review March 11, 2024 16:08

utils addressed, deferring fits

@pllim pllim removed the utils label Mar 11, 2024
@saimn
Contributor

saimn commented Mar 12, 2024

Finally coming to this one (after catching up with the flood of notifications; taking some time off is getting harder...).
So when adding the slow mark, we used to have tests that take way more than 10 sec (see for example the run with slow tests: https://github.com/astropy/astropy/actions/runs/8011602338/job/21885207661#step:10:1721).

Now, for a run without slow tests (https://github.com/astropy/astropy/actions/runs/8011602338/job/21885207469#step:10:1717), it seems more reasonable, with only a few tests taking more than 2 sec (on CI, which is usually much slower):

4.10s call     docs/timeseries/lombscarglemb.rst::lombscarglemb.rst
3.35s call     .tox/py310-test-alldeps/Lib/site-packages/astropy/coordinates/tests/test_angles.py::test_angle_multithreading
2.34s call     .tox/py310-test-alldeps/Lib/site-packages/astropy/nddata/tests/test_nduncertainty.py::test_for_leak_with_uncertainty
2.13s call     .tox/py310-test-alldeps/Lib/site-packages/astropy/coordinates/tests/test_angles.py::test_str_repr_angles_nan[input0-nan-nan deg-Angle]
2.06s setup    .tox/py310-test-alldeps/Lib/site-packages/astropy/coordinates/tests/test_angles.py::test_str_repr_angles_nan[input0-nan-nan deg-Angle]
2.00s call     .tox/py310-test-alldeps/Lib/site-packages/astropy/io/fits/tests/test_image.py::TestImageFunctions::test_open_scaled_in_update_mode

So unless we decide to skip all tests taking more than 1 s (or even less), I would prefer to keep the 3 tests doing a sleep in io.fits.

@saimn
Contributor

saimn commented Mar 12, 2024

And local results with a faster CPU:

2.85s call     docs/timeseries/lombscarglemb.rst::lombscarglemb.rst
2.01s call     astropy/io/fits/tests/test_image.py::TestImageFunctions::test_open_scaled_in_update_mode
1.90s call     astropy/nddata/tests/test_nduncertainty.py::test_for_leak_with_uncertainty
1.52s call     astropy/timeseries/periodograms/lombscargle_multiband/tests/test_lombscargle_multiband.py::test_unit_conversions[False-flexible]
1.50s call     astropy/timeseries/periodograms/lombscargle_multiband/tests/test_lombscargle_multiband.py::test_unit_conversions[True-flexible]
1.35s call     astropy/time/tests/test_precision.py::test_sidereal_lat_independent[mean]
1.26s teardown astropy/samp/tests/test_web_profile.py::TestWebProfile::test_main
1.12s teardown astropy/samp/tests/test_web_profile.py::TestWebProfile::test_web_profile
1.01s call     astropy/samp/tests/test_hub.py::test_SAMPHubServer_run
1.00s call     astropy/io/fits/hdu/compressed/tests/test_compressed.py::TestCompressedImage::test_open_comp_image_in_update_mode
1.00s call     astropy/io/fits/tests/test_groups.py::TestGroupsFunctions::test_open_groups_in_update_mode
...
================= 28290 passed, 323 skipped, 231 xfailed in 173.34s (0:02:53) ==================

Sure, we could speed up even more by removing some tests, but it seems difficult to set a limit (and what about tests that run faster but are repeated dozens of times?).

@neutrinoceros
Contributor Author

Thanks for the insight, I had no idea that 20 s tests were a thing 🤯
Let me just revert the changes to io/fits completely so we can all forget about this!

@neutrinoceros neutrinoceros force-pushed the tests/rfc/mark_slow_tests branch from 40fdec3 to 0016bad on March 12, 2024 18:26
@neutrinoceros neutrinoceros changed the title from "TST: Mark slow tests in astropy/samp and reduce sleeping time in long running io/fits tests" to "TST: Mark slow tests in astropy/samp" on Mar 12, 2024
Member

@pllim pllim left a comment

Thanks for your understanding!

@pllim pllim removed the io.fits label Mar 12, 2024
@pllim pllim enabled auto-merge March 12, 2024 18:31
@pllim pllim merged commit 2390bf8 into astropy:main Mar 12, 2024
@neutrinoceros neutrinoceros deleted the tests/rfc/mark_slow_tests branch March 12, 2024 21:55