BUG: mixed object arrays and nan functions #8974

Closed
pwolfram opened this issue Apr 21, 2017 · 12 comments · Fixed by #9013

Comments

@pwolfram
Contributor

The following code produces non-intuitive behavior that I believe may be a bug:

In [1]: import numpy as np

In [2]: import datetime

In [3]: test = np.array([[datetime.datetime(1970, 7, 15, 0, 0), 364.0],
   ...:        [datetime.datetime(2017, 2, 5, 0, 0), np.nan]], dtype=object)

In [4]: np.nanmin(test[:,1])
/Users/pwolfram/anaconda/bin/ipython:1: RuntimeWarning: All-NaN axis encountered
  #!/Users/pwolfram/anaconda/bin/python
Out[4]: nan

In [5]: np.nanmax(test[:,1])
/Users/pwolfram/anaconda/bin/ipython:1: RuntimeWarning: All-NaN slice encountered
  #!/Users/pwolfram/anaconda/bin/python
Out[5]: nan

I would have expected the output from [4] and [5] to be 364.0, not nan. Other nan functions, e.g. nanmean, also appear to be affected. Furthermore, the RuntimeWarning is fairly cryptic and doesn't appear to be correct.

Is there something I'm missing here? This is not at all what I would expect: the call boils down to a reducing function applied along a dimension that is effectively a numpy array that is not all NaNs, contrary to what the RuntimeWarning claims.
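For completeness, the slice being reduced looks exactly as expected, so the problem really does seem to be in the nan functions (a quick check, continuing the session above):

In [6]: test[:,1]
Out[6]: array([364.0, nan], dtype=object)

In [7]: test[:,1].dtype
Out[7]: dtype('O')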

@eric-wieser
Member

eric-wieser commented Apr 21, 2017

On Python 3, I get TypeError: unorderable types: datetime.datetime() <= float(). nanmean gives me a similar error.

Python 2 lets you compare anything to anything else and get a totally arbitrary result, which is why you don't get an error.
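A rough illustration of the difference (generic examples, not taken from the array above):

# Python 3.5
>>> import datetime
>>> datetime.datetime(2017, 1, 1) <= 1.0
Traceback (most recent call last):
  ...
TypeError: unorderable types: datetime.datetime() <= float()

# Python 2.7: mixed-type comparisons silently fall back to an arbitrary ordering
>>> 'a string' < 1
False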

@eric-wieser
Member

Apologies, I forgot the slicing. I can replicate this behaviour on 3.5.

@eric-wieser
Member

eric-wieser commented Apr 21, 2017

Geez, it seems that nanmin is just completely broken on the simplest input:

>>> arr = np.array([np.nan, 1], dtype=np.float64)
>>> np.nanmin(arr)
nan

There's a comment in there that says # Fast, but not safe for subclasses of ndarray. Seems to me that it should read # Fast, but not safe ever...

Edit: not on master

@pwolfram
Contributor Author

For the record, here is a somewhat hacky fix that is obviously unsavory for several reasons (fairly arbitrary cast, introduction of an if branch, more code):

import numpy as np

def safe_nanfunc(data, nanfunc):
    # Work around the object-array issue by casting to float before reducing.
    if data.dtype == object:
        data = data.astype('f')
    return nanfunc(data)
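With the test array from the top of the issue, that gives the value I was expecting (at least on my setup):

>>> safe_nanfunc(test[:,1], np.nanmin)
364.0
>>> safe_nanfunc(test[:,1], np.nanmax)
364.0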

@pwolfram
Contributor Author

In [1]: import numpy as np

In [2]: np.array([np.nan, 1], dtype=np.float64)
Out[2]: array([ nan,   1.])

In [3]: np.nanmin(np.array([np.nan, 1], dtype=np.float64))
Out[3]: 1.0

In [4]: np.nanmax(np.array([np.nan, 1], dtype=np.float64))
Out[4]: 1.0

wasn't a problem for me. I'm on Python 2.7.13 via conda on macOS, and

In [5]: np.__version__
Out[5]: '1.12.1'

@pwolfram
Contributor Author

It seems like unit tests should have caught this error, but regardless we have one above now! @eric-wieser, if you agree this is a bug, how long would it take to get this into a minor release if a fix can be found quickly?

@pwolfram changed the title from "Mixed object arrays and nan functions (bug?)" to "BUG: mixed object arrays and nan functions" on Apr 21, 2017
@eric-wieser
Member

eric-wieser commented Apr 21, 2017

I've filed a more specific issue at #8975

I doubt that there'll be a new release of 1.12 in the immediate future, since 1.13 is supposedly right round the corner - and the https://github.com/numpy/numpy/milestone/48 milestone is pretty empty right now.

@pwolfram
Contributor Author

@eric-wieser, I should note the speed of a fix isn't a real problem but more of a curiosity for me. This is the type of thing I could likely fix, but the unfortunate thing is I'm under some severe time pressure right now and have a workaround via safe_nanfunc for my current workflow.

@pwolfram
Contributor Author

Thanks @eric-wieser for the hints on how to fix the issue. It sounds like this might be fast to fix, but my numpy dev environment is quite stale now. What is the typical timeline here, i.e., the process to get this type of thing fixed? I can't work on this now but may be able to make some time over the next 60 days or so.

@eric-wieser
Member

eric-wieser commented Apr 21, 2017

I've split off the fmin thing into a new issue (#8975)

There's a quick fix to the nanfunctions, which would be to change nanmin and friends to take the slow path for object arrays with a simple dtype == object check - I'd imagine you could submit that PR directly through the GitHub web interface in around half an hour.
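Roughly speaking, the dispatch would look something like this (a simplified sketch of the idea only, not the actual numpy source):

import numpy as np

def nanmin_sketch(a):
    a = np.asanyarray(a)
    if type(a) is np.ndarray and a.dtype != object:
        # Fast path: fmin.reduce drops nans for ordinary float dtypes.
        return np.fmin.reduce(a.ravel())
    # Slow path for object arrays (and ndarray subclasses): drop nans by hand.
    keep = [x for x in a.ravel() if x == x]  # nan is the only value not equal to itself
    return min(keep)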

@eric-wieser
Member

eric-wieser commented Apr 21, 2017

A related bug: #6209

@eric-wieser
Member

@pwolfram: Numpy 1.13 will now have a workaround for this, although the underlying bugs (#9009 and #8975) are still there.
