ENH: Create boolean and integer ufuncs for isnan, isinf, and isfinite. #12988

qwhelan · 2019-02-19T03:47:10Z

For the following sample code:

import numpy as np
arr = np.full(10**5, 0, bool)

for i in range(10000):
    np.isnan(arr)

We get the following call graph from pprof:

This PR eliminates the trip through npy_half by providing specialized npy_bool implementations of isnan, isinf, and isfinite Given the limited values supported by npy_bool, we can trivially specify the result for all inputs. Doing so initially provided a ~20x speedup and now is closer to ~250x:

$ asv compare HEAD^ HEAD -s

Benchmarks that have improved:

       before           after         ratio
     [95db8c28]       [e861372b]
     <bool_ufunc~1>       <bool_ufunc>
-     1.23±0.01ms      5.03±0.09μs     0.00  bench_ufunc.IsNan.time_isnan('bool')
-        65.8±8μs       5.28±0.1μs     0.08  bench_ufunc.IsNan.time_isnan('int16')
-      87.9±0.7μs      5.32±0.05μs     0.06  bench_ufunc.IsNan.time_isnan('int32')
-         145±1μs      5.40±0.08μs     0.04  bench_ufunc.IsNan.time_isnan('int64')

Benchmarks that have stayed the same:

       before           after         ratio
     [95db8c28]       [e861372b]
     <bool_ufunc~1>       <bool_ufunc>
        119±0.6μs        118±0.4μs     1.00  bench_ufunc.IsNan.time_isnan('complex128')
          240±1μs          243±5μs     1.01  bench_ufunc.IsNan.time_isnan('complex256')
          106±1μs        105±0.3μs     0.99  bench_ufunc.IsNan.time_isnan('complex64')
         568±10μs          567±6μs     1.00  bench_ufunc.IsNan.time_isnan('float16')
         27.4±2μs      27.7±0.08μs     1.01  bench_ufunc.IsNan.time_isnan('float32')
       47.3±0.4μs       47.5±0.2μs     1.00  bench_ufunc.IsNan.time_isnan('float64')
          141±2μs        141±0.5μs     1.00  bench_ufunc.IsNan.time_isnan('longfloat')

The call graph is also greatly simplified:

eric-wieser · 2019-02-19T05:39:34Z

This has come up before I think, but in the wider context of also providing integer specializations. I don't remember if there were objections, or if the pr just stalled. I'll comment with links when I find the PR (s?) I'm thinking of

qwhelan · 2019-02-19T06:32:35Z

@eric-wieser Thanks, that would be appreciated if you're able to locate any prior discussions.

For additional context, here's a trivial example of how this manifests to a user (especially in a library like pandas):

import numpy as np

bools = np.full(10**6, 1, bool)
ints = np.full(10**6, 1, int)

%timeit np.isnan(bools)
6.33 ms ± 195 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit np.isnan(ints)
864 µs ± 28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So any user reliant on NaN-masking currently sees a ~7.3x speedup by using ints instead of bools for storing bool data, which encourages use of overly broad types for performance reasons.

eric-wieser · 2019-02-19T06:34:39Z

I'm not sure I understand how a user would nan-mask bool or int data. Either way, it surprises me that the ints are faster than the bools.

eric-wieser · 2019-02-19T06:51:13Z

Ok, this is the second time I've tried and failed to find that PR, so I think I'm going to give up.

You should be able to reuse your isnan, isfinite, and isinf loops for all the integer types.

If you want to extend this PR to that, one thing to watch out for is that these functions need to not return true on np.datetime64('nat') and np.timedelta64('nat').

If you want to avoid this pain, you could at least add loops for bBhHiI types, and stick a TODO in about adding it for lLqQ types.

qwhelan · 2019-02-19T06:53:20Z

@eric-wieser A common scenario would be using pandas, which adds a NaN-mask to most functions, including all() and any():

bools = pd.Series(np.full(10**6, 1, bool))
ints = pd.Series(np.full(10**6, 1, int))

%timeit bools.all()
7.63 ms ± 88.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit ints.all()
2.7 ms ± 20.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit np.all(bools.to_numpy())
51.4 µs ± 695 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit np.all(ints.to_numpy())
747 µs ± 46.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So pretty painful overhead from NaN masking in the bool case.

My PR pandas-dev/pandas#25070 would try and fix the pandas side of things, but there's still some low-hanging fruit in the numpy case.

charris · 2019-02-19T15:17:19Z

Needs a release note.

numpy/core/src/umath/loops.c.src

eric-wieser · 2019-02-21T06:47:35Z

Thanks for adding the integer loops.

Can you check that the following passes?

@pytest.mark.parametrize('nat', [np.datetime64('nat'), np.timedelta64('nat')])
def test_nat_is_not_finite(self, nat):
    try:
        assert not np.isfinite(nat)
    except TypeError:
        pass  # ok, just not implemented

@pytest.mark.parametrize('nat', [np.datetime64('nat'), np.timedelta64('nat')])
def test_nat_is_nan(self, nat):
    try:
        assert np.isnan(nat)
    except TypeError:
        pass  # ok, just not implemented

@pytest.mark.parametrize('nat', [np.datetime64('nat'), np.timedelta64('nat')])
def test_nat_is_no_tinf(self, nat):
    try:
        assert not np.isinf(nat)
    except TypeError:
        pass  # ok, just not implemented

numpy/core/tests/test_ufunc.py

eric-wieser · 2019-02-21T07:53:15Z

I'll build this branch locally and play around with it. Perhaps the failure mode I'm expecting doesn't exist, which would be great!

eric-wieser · 2019-02-26T05:51:19Z

Needs a rebase - the new macros should move to fast_loop_macros.h

qwhelan · 2019-02-28T00:48:28Z

@eric-wieser Rebased and tests passing

doc/release/1.17.0-notes.rst

eric-wieser

Played around with this locally, and realized that only int -> time casting is allowed, not vice versa. This all looks great.

I do wonder if the benchmark is worth including though - will leave that decision to other reviewers.

Previously, boolean values would be routed through the half implementations of these functions, which added considerable overhead. Creating specialized ufuncs improves performance by ~250x Additionally, enable autovectorization of new isnan, isinf, and isfinite ufuncs.

qwhelan · 2019-03-10T08:18:28Z

@eric-wieser I've removed the benchmark from this PR.

eric-wieser · 2019-03-15T05:21:59Z

@charris, look good to merge?

qwhelan · 2019-03-28T23:42:45Z

@charris Are there any desired changes to this approach? I have several more patches in this vein that I've held off on submitting until this is merged.

eric-wieser · 2019-03-29T02:11:48Z

I'll go ahead and put this in - nothing here seems controversial

charris · 2019-03-29T02:53:16Z

numpy/core/src/umath/fast_loop_macros.h

+#define OUTPUT_LOOP_FAST(tout, op) \
+    do { \
+    /* condition allows compiler to optimize the generic macro */ \
+    if (IS_OUTPUT_CONT(tout)) { \


Should be indented.

I think this matches the existing macros, sadly

Please see #13208 - I have fixed the indentation for all macros in this file

charris · 2019-03-29T02:55:39Z

numpy/core/src/umath/fast_loop_macros.h

+    OUTPUT_LOOP { \
+        tout * out = (tout *)op1; \
+        op; \
+    }


Blank line between macros.

Fixed in #13208

charris · 2019-03-29T03:00:13Z

numpy/core/src/umath/fast_loop_macros.h

+ */
+#define BASE_OUTPUT_LOOP(tout, op) \
+    OUTPUT_LOOP { \
+        tout * out = (tout *)op1; \


Can omit the space after *.

Fixed in #13208

charris · 2019-03-29T03:07:33Z

numpy/core/src/umath/loops.c.src

+NPY_NO_EXPORT void
+BOOL_@kind@(char **args, npy_intp *dimensions, npy_intp *steps, void *NPY_UNUSED(func))
+{
+    OUTPUT_LOOP_FAST(npy_bool, *out = @val@);


Why not just pass the value and move the assignment to the macro where out is declared?

It was entirely to preserve calling convention to be similar to the other macros - I've implemented your suggestion in #13208

charris · 2019-03-29T03:13:08Z

Looks OK aside from some style/organization nits. The depth of the macro nesting makes it a bit hard to follow, but looks correct. The original behavior of promoting to float looks weird in truth, but no weirder than the functions being called on integer/boolean types :)

qwhelan · 2019-03-29T03:47:42Z

@charris Thanks for the comments and please see #13208 for implementation

Thanks @eric-wieser for merging!

charris changed the title ~~PERF: create boolean ufuncs for isnan, isinf, isfinite, and signbit.~~ ENH: Create boolean ufuncs for isnan, isinf, isfinite, and signbit. Feb 19, 2019

charris added 01 - Enhancement component: numpy._core 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes labels Feb 19, 2019

qwhelan force-pushed the bool_ufunc branch from 657831a to 1fdfcc6 Compare February 20, 2019 03:59

qwhelan changed the title ~~ENH: Create boolean ufuncs for isnan, isinf, isfinite, and signbit.~~ ENH: Create boolean and integer ufuncs for isnan, isinf, and isfinite. Feb 20, 2019

eric-wieser reviewed Feb 21, 2019

View reviewed changes

numpy/core/src/umath/loops.c.src Outdated Show resolved Hide resolved

eric-wieser reviewed Feb 21, 2019

View reviewed changes

numpy/core/src/umath/loops.c.src Outdated Show resolved Hide resolved

qwhelan force-pushed the bool_ufunc branch from 1fdfcc6 to 969ec74 Compare February 21, 2019 07:27

eric-wieser reviewed Feb 21, 2019

View reviewed changes

numpy/core/tests/test_ufunc.py Outdated Show resolved Hide resolved

qwhelan force-pushed the bool_ufunc branch from 969ec74 to acba9e0 Compare February 21, 2019 17:48

eric-wieser mentioned this pull request Feb 25, 2019

MAINT: Extract the loop macros into their own header #13032

Merged

eric-wieser removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Feb 26, 2019

qwhelan force-pushed the bool_ufunc branch 4 times, most recently from f704308 to 7875f3c Compare February 28, 2019 00:10

eric-wieser mentioned this pull request Mar 5, 2019

Add isfinite support for datetime64 and timedelta64 #5610

Closed

eric-wieser reviewed Mar 5, 2019

View reviewed changes

doc/release/1.17.0-notes.rst Show resolved Hide resolved

eric-wieser reviewed Mar 5, 2019

View reviewed changes

qwhelan force-pushed the bool_ufunc branch from 7875f3c to 5785ca7 Compare March 10, 2019 07:42

eric-wieser approved these changes Mar 10, 2019

View reviewed changes

eric-wieser mentioned this pull request Mar 21, 2019

ENH: Added support for arrays with dtype=object to np.isinf, np.isnan, np.isfinite #10820

Closed

qwhelan mentioned this pull request Mar 22, 2019

PERF: O(n) speedup in any/all by re-enabling short-circuiting for bool case pandas-dev/pandas#25070

Merged

4 tasks

eric-wieser merged commit db5fcc8 into numpy:master Mar 29, 2019

charris reviewed Mar 29, 2019

View reviewed changes

qwhelan mentioned this pull request Mar 29, 2019

MAINT: cleanup of fast_loop_macros.h #13208

Merged

qwhelan deleted the bool_ufunc branch March 29, 2019 03:49

qwhelan mentioned this pull request Apr 10, 2019

Series.all much slower than Series.values.all pandas-dev/pandas#26032

Closed

anirudh2290 mentioned this pull request May 26, 2020

array_equal(a, b, equal_nan=True) throws errors for array with non-numeric values #16377

Open

rossbar mentioned this pull request Jul 17, 2020

isnan, isposinf and isneginf do not accept fractions.Fraction #15517

Open

Uh oh!

ENH: Create boolean and integer ufuncs for isnan, isinf, and isfinite. #12988

ENH: Create boolean and integer ufuncs for isnan, isinf, and isfinite. #12988

Uh oh!

Conversation

qwhelan commented Feb 19, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eric-wieser commented Feb 19, 2019

Uh oh!

qwhelan commented Feb 19, 2019

Uh oh!

eric-wieser commented Feb 19, 2019

Uh oh!

eric-wieser commented Feb 19, 2019

Uh oh!

qwhelan commented Feb 19, 2019

Uh oh!

charris commented Feb 19, 2019

Uh oh!

Uh oh!

Uh oh!

eric-wieser commented Feb 21, 2019

Uh oh!

Uh oh!

eric-wieser commented Feb 21, 2019

Uh oh!

eric-wieser commented Feb 26, 2019

Uh oh!

qwhelan commented Feb 28, 2019

Uh oh!

Uh oh!

eric-wieser left a comment

Choose a reason for hiding this comment

Uh oh!

qwhelan commented Mar 10, 2019

Uh oh!

eric-wieser commented Mar 15, 2019

Uh oh!

qwhelan commented Mar 28, 2019

Uh oh!

eric-wieser commented Mar 29, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charris commented Mar 29, 2019

Uh oh!

qwhelan commented Mar 29, 2019

Uh oh!

Uh oh!

qwhelan commented Feb 19, 2019 •

edited

Loading