BUG: Don't convert inputs to np.float64 in digitize #11464
def digitize(x, bins, right=False):
    """
Docstring is copied verbatim
f99b441 to 5f3f47c
Also makes
x = 2**54  # loses precision in a float
assert_equal(np.digitize(x, [x - 1, x + 1]), 1)

@dec.knownfailureif(True, "np.core.multiarray._monoticity loses precision")
This needs:
from numpy.testing import dec
I took a look at the source for knownfailureif, and it is using nose under the hood. I'd normally suggest using @pytest.mark.xfail instead, but the NumPy roadmap seems to imply that we want to avoid using "pytest magic" and mostly just use it as a runner (it seems Guido agrees, but if nose is unmaintained this may need some thought). I'm not sure whether the nose machinery behind dec (and perhaps elsewhere?) will eventually have to be replaced, though.
NumPy roadmap seems to imply that we want to avoid using "pytest magic" and mostly just use it as a runner
Don't use anything with nose in it: we don't want the dependency, nose itself is unmaintained, and the version up on pip is not Python 3.7 compatible. Definitely use xfail instead.
We keep dec for backwards compatibility with folks using the numpy testing framework. You will note that numpy itself no longer uses it anywhere except in the tests for numpy.testing itself.
Guido has a point about the pytest "documentation by example": the voluminous documentation for pytest is very hard to use for reference and learning, but I expect that at some point someone will make a "real" reference :)
Unless there's an equally concise alternative, I don't see a problem with @pytest.mark.xfail. The xfail and skip marks are kind of essential.
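For readers unfamiliar with the mark being discussed, here is a minimal sketch of @pytest.mark.xfail on a known failure. The test name and reason string are hypothetical and are not taken from the NumPy test suite; the body uses plain Python floats (IEEE 754 doubles) as a stand-in for the precision problem mentioned in the review.

```python
import pytest


# Hypothetical test, not taken from the NumPy test suite. The reason string
# is illustrative only.
@pytest.mark.xfail(reason="int -> float conversion loses precision above 2**53")
def test_order_survives_float_conversion():
    x = 2**54
    # Fails today: both sides round to the same double, so pytest reports
    # the test as an expected failure (xfail) rather than an error.
    assert float(x - 1) < float(x)
```

Unlike the nose-based decorator, this needs only pytest itself, which is already the test runner.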
5f3f47c to d316777
Switched to use
Rebased on #11474
Thanks Eric. I'm guessing this has pretty much the same performance as before when the arrays have significant size. Might be good to have a benchmark at some point.
Was there a downside to float64?
Yes - conversion from uint64 to float64 is lossy, so digitize(uint64_array, uint64_bins) would produce incorrect results.
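The lossy conversion can be seen without NumPy, since Python's float is the same IEEE 754 double format as np.float64. The sketch below reuses the value 2**54 from the test in this PR. The variable names are only for illustration.

```python
# Pure-Python illustration: Python's float is an IEEE 754 double, the same
# format as np.float64, so int -> float conversion shows the same rounding.
x = 2**54  # above 2**53, not every integer is representable as a double

# Bin edges mirroring the [x - 1, x + 1] bins used in the test.
bins = [x - 1, x + 1]

# As exact integers, x lies strictly between the two edges.
exact_ok = bins[0] <= x < bins[1]

# After conversion to float, the value and both edges collapse to one number.
as_float = [float(b) for b in bins]
collapsed = as_float[0] == float(x) == as_float[1]

print(exact_ok, collapsed)  # True True
```

Once the edges and the value are indistinguishable, no comparison-based binning can place x correctly, which is why the conversion has to be avoided.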
This converts digitize to a pure-python function that falls back on searchsorted.
Performance doesn't really matter here anyway - if you care about performance, then you should just call searchsorted directly, rather than checking the order of the bins.
Partially fixes gh-11022
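As a rough sketch of the idea described above (check the order of the bins, then defer to a sorted search), here is a scalar-only, pure-Python analogue. It is not NumPy's implementation: digitize_sketch is a hypothetical name, and the standard library's bisect functions stand in for np.searchsorted.

```python
from bisect import bisect_left, bisect_right


def digitize_sketch(x, bins, right=False):
    """Return the bin index of scalar x in a monotonic sequence of bins."""
    bins = list(bins)
    increasing = all(a <= b for a, b in zip(bins, bins[1:]))
    decreasing = all(a >= b for a, b in zip(bins, bins[1:]))
    if not (increasing or decreasing):
        raise ValueError("bins must be monotonically increasing or decreasing")

    # right=False means bins[i-1] <= x < bins[i], i.e. a right-sided search.
    search = bisect_left if right else bisect_right
    if increasing:
        return search(bins, x)
    # Decreasing bins: search the reversed sequence and invert the result.
    return len(bins) - search(bins[::-1], x)


x = 2**54
print(digitize_sketch(x, [x - 1, x + 1]))  # 1
print(digitize_sketch(x, [x + 1, x - 1]))  # 1
```

Because the comparisons are done on exact integers, the large-integer case from the test gives the expected bin. The monotonicity check is the only work beyond the search itself, which matches the remark that performance-sensitive callers can call searchsorted directly.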