BUG: Don't convert inputs to `np.float64` in digitize #11464

eric-wieser · 2018-06-30T07:16:28Z

This converts digitize to a pure-python function that falls back on searchsorted.

Performance doesn't really matter here anyway - if you care about performance, then you should just call searchsorted directly, rather than checking the order of the bins.

Partially fixes gh-11022
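For readers following along, here is a minimal sketch of what a pure-Python `digitize` on top of `searchsorted` could look like. It is not the code merged in this PR: the name `digitize_sketch` is hypothetical, the monotonicity check is done with plain NumPy comparisons, and `bins` is assumed to be 1-D.

```python
import numpy as np


def digitize_sketch(x, bins, right=False):
    """Hypothetical pure-Python digitize built on np.searchsorted."""
    x = np.asarray(x)
    bins = np.asarray(bins)  # assumed 1-D

    # Check the bin order with comparisons only, so no dtype conversion occurs.
    increasing = bool(np.all(bins[1:] >= bins[:-1]))
    decreasing = bool(np.all(bins[1:] <= bins[:-1]))
    if not (increasing or decreasing):
        raise ValueError("bins must be monotonically increasing or decreasing")

    # The argument order is swapped relative to digitize, so the side flips.
    side = "left" if right else "right"
    if increasing:
        return np.searchsorted(bins, x, side=side)
    # Decreasing bins: search the reversed bins and invert the result.
    return len(bins) - np.searchsorted(bins[::-1], x, side=side)


x = np.array([0.2, 6.4, 3.0, 1.6])
print(digitize_sketch(x, np.array([0.0, 1.0, 2.5, 4.0, 10.0])))  # [1 4 3 2]
print(digitize_sketch(x, np.array([10.0, 4.0, 2.5, 1.0, 0.0])))  # [4 1 2 3]
```

Because nothing here casts the inputs, integer inputs stay integers all the way through the search.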

eric-wieser · 2018-06-30T07:16:55Z

numpy/lib/function_base.py

+
+
+def digitize(x, bins, right=False):
+    """


Docstring is copied verbatim

eric-wieser · 2018-06-30T07:22:32Z

Also makes np.digitize(x, []) == np.zeros_like(x), rather than erroring, which seems correct to me
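A small illustration of why the empty-bins case falls out naturally once `digitize` defers to `searchsorted`. This sketch calls `np.searchsorted` directly rather than `np.digitize`, so it shows the underlying behaviour and not the exact code path in this PR.

```python
import numpy as np

x = np.array([0.5, 1.5, 2.5])
empty_bins = np.array([])

# With no bin edges, every value falls at insertion index 0.
indices = np.searchsorted(empty_bins, x, side="right")
print(indices)  # [0 0 0]
```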

tylerjereddy · 2018-06-30T14:35:35Z

numpy/lib/tests/test_function_base.py

+        x = 2**54  # loses precision in a float
+        assert_equal(np.digitize(x, [x - 1, x + 1]), 1)
+
+    @dec.knownfailureif(True, "np.core.multiarray._monoticity loses precision")


This needs:

from numpy.testing import dec

I took a look at the source for knownfailureif, and it is using nose under the hood. I'd normally suggest using @pytest.mark.xfail instead, but the NumPy roadmap seems to imply that we want to avoid using "pytest magic" and mostly just use it as a runner (it seems Guido agrees, but if nose is unmaintained, this may need some thought). I'm not sure whether the nose code behind dec (and perhaps elsewhere?) will eventually have to be replaced, though.

charris · 2018-06-30

> NumPy roadmap seems to imply that we want to avoid using "pytest magic" and mostly just use it as a runner

Don't use anything with nose in it: we don't want the dependency, nose itself is unmaintained, and the version up on pip is not Python 3.7 compatible. Definitely use xfail instead.

We keep dec for backwards compatibility with folks using the numpy testing framework. You will note that numpy itself no longer uses it anywhere except for testing numpy.testing itself.

Guido has a point about the pytest "documentation by example": the voluminous documentation for pytest is very hard to use for reference and learning, but I expect that at some point someone will make a "real" reference :)

rgommers · 2018-06-30

Unless there's an equally concise alternative, I don't see a problem with @pytest.mark.xfail. The xfail and skip marks are kind of essential.

eric-wieser · 2018-06-30T18:36:49Z

Switched to use xfail. I hadn't checked whether the dec.knownfailureif example was within the meta-test stuff.
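As a rough sketch of the `xfail` form discussed in this thread: the test name and body below are hypothetical stand-ins, not the test added in this PR. They reuse the same kind of precision loss at 2**54 so the example runs without depending on any particular NumPy version.

```python
import pytest


# Hypothetical test: marked as an expected failure instead of using
# the nose-based numpy.testing.dec.knownfailureif decorator.
@pytest.mark.xfail(reason="float64 cannot represent 2**54 + 1 exactly")
def test_float_roundtrip_is_exact():
    x = 2**54 + 1
    # float(x) rounds to 2**54, so this assertion fails as expected.
    assert int(float(x)) == x
```

When collected by pytest, a test marked this way is reported as an expected failure rather than as an error, and the mark needs no nose dependency.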

eric-wieser · 2018-07-06T19:19:06Z

Rebased on #11474

charris · 2018-07-08T22:04:29Z

Thanks Eric. I'm guessing this has pretty much the same performance as before when the arrays have significant size. Might be good to have a benchmark at some point.

charris · 2018-07-08T22:05:26Z

Was there a downside to float64?

eric-wieser · 2018-07-08T22:16:35Z

Yes - conversion from uint64 to float64 is lossy (float64 cannot represent every integer above 2**53), so digitize(uint64_array, uint64_bins) would produce incorrect results
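To make the precision loss concrete, here is a short sketch using `np.searchsorted` directly, with the same 2**54 values as the test in this PR. The explicit `astype(np.float64)` stands in for the conversion the old implementation performed; it is an illustration, not the old code itself.

```python
import numpy as np

x = np.uint64(2**54)
bins = np.array([2**54 - 1, 2**54 + 1], dtype=np.uint64)

# Exact integer comparison: x lies between the two bin edges.
exact = np.searchsorted(bins, x, side="right")

# After casting to float64, both edges round to 2**54 and collide with x.
lossy = np.searchsorted(bins.astype(np.float64), np.float64(x), side="right")

print(exact, lossy)  # 1 2
```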

eric-wieser added the "00 - Bug" and "component: numpy._core" labels on Jun 30, 2018

eric-wieser force-pushed the monotonicity branch from f99b441 to 5f3f47c on June 30, 2018 07:19

tylerjereddy reviewed Jun 30, 2018

eric-wieser force-pushed the monotonicity branch from 5f3f47c to d316777 on June 30, 2018 18:31

eric-wieser force-pushed the monotonicity branch from d316777 to 307dd76 on July 6, 2018 19:18

charris merged commit 7cd94f2 into numpy:master Jul 8, 2018

eric-wieser mentioned this pull request Jul 8, 2018

BUG: np.digitize casts integers to float64 #11022

Open
