Doing something about slow tests again #25472
I propose that we introduce an `xslow` marker for the very slowest tests.
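The marker proposed here (named `xslow` later in this thread) could be registered along the lines of the following `conftest.py` sketch. This is a minimal illustration, not NumPy's actual test configuration:

```python
# Hypothetical conftest.py fragment registering an "xslow" marker,
# so that `pytest -m "not xslow"` can deselect the very slowest tests.

def pytest_configure(config):
    # addinivalue_line is the standard pytest hook-config API for
    # registering custom markers so they don't trigger warnings.
    config.addinivalue_line(
        "markers",
        "xslow: extremely slow test, deselect with -m 'not xslow'",
    )
```

A CI job would then run `pytest -m "not xslow"` by default and a dedicated (e.g. cron) job would run `pytest -m xslow`.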
All the PyPy tests are slow; everything else finishes while the three PyPy jobs are still running. It might be useful to check why. @mattip any ideas?
Can we group parametrized tests? I suspect there is some excessive parametrization, and this doesn't look like it would notice it.
Having an xslow would be fine. It would be nice to make sure that they are run on release wheels, but OK...
You might also want to consider moving more CI to cron jobs, perhaps the PyPy and SIMD ones, allowing one to trigger those manually by setting a specific label (we do this in astropy for emulated architectures). Many PRs only touch Python code; perhaps this can be recognized/labelled automatically, with CI depending on the label?
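The astropy-style setup described above could look roughly like this in GitHub Actions. This is a hypothetical sketch (the label name `run-heavy-ci` and job name are made up), not a config taken from any of the projects mentioned:

```yaml
# Hypothetical workflow trigger: run heavy jobs (e.g. PyPy, SIMD) on a
# weekly cron, or on PRs that carry an opt-in label.
on:
  schedule:
    - cron: "0 3 * * 1"   # weekly, Monday 03:00 UTC
  pull_request:
    types: [labeled, synchronize]

jobs:
  heavy-tests:
    if: >
      github.event_name == 'schedule' ||
      contains(github.event.pull_request.labels.*.name, 'run-heavy-ci')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```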
FWIW, I wouldn't mind such a setup; the one thing is it would be nice to auto-open an issue on failure I guess. I wonder if @pllim might just know how to set that up quite quickly? For C vs. Python, I am not sure I think it is worthwhile; running only the most basic tests on PyPy could be moved or not (because it does fail occasionally on larger C-changes). The SIMD/architecture tests would be nice for cron + explicit triggering though.
The very slowest ones don't seem to be affected; OTOH the parametrized ones are often not marked slow, and beyond the first ~10 or so they seem to start to dominate.
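One way to check whether parametrized variants dominate cumulatively is to feed a saved `pytest --durations=0` report through a small script that sums the time per test function across its parametrizations. This is a sketch: it assumes the usual shape of pytest's durations output and only counts the `call` phase.

```python
import re
from collections import defaultdict

def aggregate_durations(lines):
    """Sum per-test-function time across parametrized variants.

    Expects lines shaped like pytest's --durations report, e.g.
    "0.25s call numpy/_core/tests/test_x.py::test_y[param-1]".
    Only the "call" phase is counted (setup/teardown are ignored).
    """
    totals = defaultdict(float)
    pat = re.compile(r"([\d.]+)s\s+call\s+(\S+)")
    for line in lines:
        m = pat.search(line)
        if not m:
            continue
        seconds, test_id = float(m.group(1)), m.group(2)
        # Strip the "[...]" parametrization suffix so variants group together.
        base = test_id.split("[", 1)[0]
        totals[base] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

report = [
    "0.25s call t.py::test_a[1]",
    "0.25s call t.py::test_a[2]",
    "0.40s call t.py::test_b",
]
print(aggregate_durations(report))
# -> [('t.py::test_a', 0.5), ('t.py::test_b', 0.4)]
```

Sorting the aggregated totals surfaces test functions that are individually fast but expensive in aggregate, which a plain per-test durations list hides.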
Intriguing. Anything called
Hello! I am not familiar with numpy tests so I can only say how astropy is doing it and you can adapt it as you see fit for this package.
Hope this helps and happy holidays!
PyPy is known to be slow on C extensions. I run weekly tests of PyPy HEAD against common or complicated projects' HEAD in a binary-testing repo. It probably makes sense for NumPy to limit testing of PyPy to a sampling strategy and not run it on every PR.
No need I think, these tests are highly unlikely to fail if they're still run in a regular CI job. So making that a manual step in the release process would be a bit much.
This type of improvement is certainly of interest as a CI improvement I'd say. It's a bit orthogonal to the main goals of this issue though, which are to speed up wheel builds to (a) reduce Cirrus CI costs, and (b) improve on iteration time when debugging CI issues. Moving away from Azure completely also falls in the "desired CI improvements" bucket. The new BLAS CI jobs could be special-cased too, like SIMD and docs.
nice, that's a helpful tool.
The main problem is PyPy wheel builds, not regular CI. I think we want to keep these wheels. We could look at not running the full test suite though, only the default (fast) tests.
Now that cp312 is mainstream we could probably build all the macOS wheels in two matrix entries: cp39-cp312 for macOS <14 (i.e. with OpenBLAS) and cp39-cp312 for macOS >=14 (i.e. with Accelerate). The CI runners have enough grunt to get through them all in under an hour. That would probably reduce cost, but wouldn't improve iteration speed while debugging.

To get through the CI builds quicker for linux_aarch64 we could give each matrix entry more CPU; currently they're only given 1 core each. However, if the CPU is underutilised then one doesn't get efficiency gains.

W.r.t. debugging Cirrus CI: it should be possible to run all the jobs locally if one has a Mac. The Cirrus CLI allows one to run the same config on your local computer. I was thinking of writing a guide for how to debug CI configs.
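The two-entry matrix described above might look roughly like the following `.cirrus.yml` fragment. This is a hypothetical sketch (task name, image, and env-var choices are illustrative), not NumPy's actual Cirrus configuration:

```yaml
# Hypothetical Cirrus CI fragment: one matrix entry per macOS deployment
# target, each building all supported CPython wheels via cibuildwheel.
macos_wheels_task:
  macos_instance:
    image: ghcr.io/cirruslabs/macos-sonoma-xcode:latest
  matrix:
    - env:
        CIBW_BUILD: "cp39-* cp310-* cp311-* cp312-*"
        MACOSX_DEPLOYMENT_TARGET: "10.13"   # OpenBLAS builds
    - env:
        CIBW_BUILD: "cp39-* cp310-* cp311-* cp312-*"
        MACOSX_DEPLOYMENT_TARGET: "14.0"    # Accelerate builds
  build_script: python -m cibuildwheel
```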
Probably best not to. It won't make too much difference in overall runtime, just save a couple of minutes for avoiding repo clones and caching conda-forge downloads. Not worth the longer runtime and churn I'd say.
2 CPUs would help I suspect; build time will be almost 2x faster, and test suite runtime ~1.7x or so.
That would be very useful I think. Also for other projects to refer to.
Yes, please!
See https://github.com/numpy/numpy/wiki/Debugging-CI-guidelines for some basic guidelines for debugging CI configurations. Bear in mind it's a WIP. @rgommers, this should be good for both scipy and numpy. |
The test suite has gradually become a bit slower again, and this makes CI jobs take longer - which now is an additional hassle because of the cost (as of now, $2.75 per wheel build run, see #24280 (comment)). Some jobs are quite slow, and the PyPy on Windows one is ridiculously slow, taking 1h 22m.
Here are the top 200 slowest tests for the `-m full` test suite (which is what the wheel builds run):

One of the top offenders is `f2py`; there is already a separate issue for that: gh-25134. Separating out those tests so they don't run at all on wheel builds will take care of that problem.

For the rest we should go through and deal with some of the tests in the above list case by case. I'm having a look at `TestStructuredObjectRefcounting` now, which is one of the worst tests.