
DOC: Document higher precision in mean/sum/add.reduce along fast axis due to pairwise sum. #9393


Closed
adler-j opened this issue Jul 9, 2017 · 20 comments

Comments

adler-j commented Jul 9, 2017

As mentioned in this stackoverflow question, calling np.mean with the axis argument seems to introduce significantly lower numerical stability. In particular,

>>> import numpy as np
>>> X = np.random.rand(9999, 128, 128, 4).astype('float32')
>>> X.shape
(9999, 128, 128, 4)
>>> np.mean(X, axis=(0, 1, 2))
array([ 0.10241024,  0.10241024,  0.10241024,  0.10241024], dtype=float32)  # should be 0.5
>>> np.mean(X[:, :, :, 0])
0.50000387

whereas calling it with full 64-bit precision gives correct values

>>> np.mean(X.astype('float64'), axis=(0, 1, 2))
array([ 0.50000323,  0.50004907,  0.50003198,  0.49999848])
>>> np.mean(X[:, :, :, 0].astype('float64'))
0.50000323305421812

Since the values are correct without the axis argument, this should likely be classified as a bug.

Edit:

This is on Windows 7 with numpy 1.12.0, but it seems others are experiencing this issue as well.

seberg (Member) commented Jul 9, 2017

That is correct behavior if your array is C-order (the default) and you omit the last axis. It comes down to the fact that we basically get better precision when the operation runs along the fast axis in memory. However, since the fastest axis in memory is the last one, and you are not taking the sum/mean over that axis, you do not get the advantage of this higher precision. The reason is that forcing it in general would involve a significant speed penalty.

The higher precision is due to a pairwise summation, but as I said, you will only get it when the summation is along the fastest axis. It is unfortunate, but the only other option is really to always provide the less precise version. You are not getting worse precision in one case; you are getting better-than-naive precision in the other.
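
To make the fast-axis point concrete, here is a sketch using the array from the original report: copying the data so that the reduced axes become the fast ones in memory lets the pairwise path apply, at the cost of a full copy (np.moveaxis and np.ascontiguousarray are ordinary NumPy calls; upcasting via dtype=np.float64, mentioned further down the thread, is usually the simpler fix).

import numpy as np

X = np.random.rand(9999, 128, 128, 4).astype(np.float32)

# C-order array reduced over axes (0, 1, 2): the kept axis is the fast one,
# so the reduction falls back to naive float32 accumulation and drifts badly
# (roughly 0.102 in the original report instead of 0.5).
print(np.mean(X, axis=(0, 1, 2)))

# Copy with the kept axis moved to the front: each output element now reduces
# one contiguous block, the pairwise summation applies, and the float32 result
# should come out close to 0.5.
Xc = np.ascontiguousarray(np.moveaxis(X, -1, 0))  # shape (4, 9999, 128, 128)
print(np.mean(Xc, axis=(1, 2, 3)))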

As for float32... naive summation of float32 reaches its limits pretty soon (basically, at some point your intermediate sum is so large that adding to it no longer changes it, at least not with enough precision), which is a reason float64 is typically preferred.
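
As a quick, deterministic sanity check of that explanation against the numbers in the original report: the reported mean times the number of summed elements lands almost exactly on 2**24, which is where a float32 accumulator stops registering increments smaller than 1.

>>> import numpy as np
>>> np.float32(2**24) + np.float32(0.5) == np.float32(2**24)  # increments below 1 are lost from here on
True
>>> round(0.10241024 * (9999 * 128 * 128))  # reported mean * number of elements summed
16777216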

Another nice thing would be to provide a more precise summation version, but that is a different issue (you are welcome to chime in). Please feel free to continue the discussion, but since this came up before, I will close the issue for now.
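
For the curious, one classic example of such a more precise summation scheme is compensated (Kahan) summation. A minimal pure-Python sketch, far too slow for real arrays and shown only to illustrate the idea of carrying the lost low-order bits along:

import numpy as np

def kahan_sum(values):
    # Keep the low-order bits that the running total cannot absorb in a
    # separate compensation term instead of silently dropping them.
    total = np.float32(0.0)
    comp = np.float32(0.0)
    for x in values:
        y = np.float32(x) - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

x = np.random.rand(100000).astype(np.float32)
print(kahan_sum(x), np.sum(x, dtype=np.float64))  # should agree to roughly float32 precision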

@seberg seberg closed this as completed Jul 9, 2017
adler-j (Author) commented Jul 9, 2017

Please feel free to continue the discussion, but since this came up before, I will close the issue for now.

Would it not be reasonable to document this in the method, even if we keep the behavior as is?

Edit:

Another idea would be to somehow provide a way to compute the 64-bit precision sum of a 32-bit array.

seberg (Member) commented Jul 9, 2017

Indeed, it would be good to mention it in the notes; can I persuade you to add it? :)

adler-j (Author) commented Jul 9, 2017

I'll see what I can do!

seberg (Member) commented Jul 9, 2017

I suppose one thing that is a bit annoying is that this is inherited from sum, so in principle any sum-based method has this behavior, and adding a note to all of them might be a bit much....

njsmith (Member) commented Jul 9, 2017

sum and mean already cover a lot of the cases where someone is likely to run into this though, and documenting just those would at least give someone a fighting chance to figure out what's going on for other cases too :-). Which other cases are you thinking of that are likely to be affected?

seberg (Member) commented Jul 9, 2017

I didn't think about it much; corr/cov come to mind as well, and maybe some linalg stuff, but yes, I agree sum/mean are probably good enough and reasonable. np.add, as the ufunc behind sum, should maybe also get a note (though it will only show up for its reduce method).

@seberg seberg reopened this Jul 9, 2017
@seberg seberg changed the title Numerical precision of np.mean lacking with axis argument DOC: Document higher precision in mean/sum/add.reduce along fast axis due to pairwise sum. Jul 9, 2017
@kevinjos

Is the behavior described above also responsible for the following result?

>>> X = np.array(np.random.random((9999, 128, 128, 4)) * 1e5, dtype='float32')
>>> X.shape
(9999, 128, 128, 4)
>>> mean_by_axis = np.mean(X, axis=(0, 1, 2))
>>> mean_by_axis
array([ 13423.11523438,  13423.11523438,  13423.11523438,  13423.11523438], dtype=float32)
>>> mean = np.mean(X[:, :, :, 0])
>>> mean
50001.297

In this case, I would expect to see a WARNING emitted after the call to np.mean(X, axis=(0, 1, 2)).

seberg (Member) commented Jul 11, 2017

@kevinjos, your example is identical to the one above (up to a constant factor, which does not matter for floating point numbers). I will point out again that in the second case numpy happens to save you from trouble; it does not put you into trouble in the first case.

Providing specifically stable sums would be nice, but it is a different issue. And yes, it is annoying that numpy can save you in most cases, but a small change in code/data structure might "disable" it (heck we even hesitated adding this feature because of that).

I don't mind the idea of warning about likely precision problems as such, but that is also a separate issue (plus it is unclear when exactly to warn and what the thresholds would be), and I frankly hope that any introductory course mentions that these things happen (I certainly do in mine). We could possibly warn about likely programming issues with low-precision floating point numbers (and even for float64, I suppose), but there is a reason float32 is not the default in any language: it can be very imprecise unless you know what you are doing. I also use it often (e.g. for processing large videos that originally have 16-bit precision), but that data is normalized, and if something went badly wrong we would actually see the effect visually.

@eric-wieser (Member)

The higher precision is due to a pairwise summation, but as I said, you will only get it when the summation is along the fastest axis.

Why is this the case? In both the examples given by @kevinjos, the last axis is already non-contiguous.

Does this become a speed vs precision trade-off? Could that trade-off be exposed to the end-user?

seberg (Member) commented Jul 11, 2017

@eric-wieser it is the fastest axis vs. not the fastest axis. And no, there is no speed trade-off: the extra precision comes at no cost when the reduction runs along the fast axis, whereas you would obviously take a potentially big performance hit if you forced the other iteration order.

ghost commented Oct 4, 2017

Using an array view can avoid the precision loss, is this right?

seberg (Member) commented Oct 4, 2017

? Not sure what you mean; views themselves do not do anything. You can use np.mean(..., dtype=np.float64) to make the result and the intermediate results float64 even if the data is float32, which may be good enough?
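
Concretely, for the float32 array from the top of the thread, that suggestion looks like this (the accumulation and the returned values are then float64):

import numpy as np

X = np.random.rand(9999, 128, 128, 4).astype(np.float32)

# Same reduction as in the original report, but accumulated in float64:
# the per-channel means should come out close to 0.5 instead of ~0.102.
print(np.mean(X, axis=(0, 1, 2), dtype=np.float64))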

ghost commented Oct 4, 2017

Yes, float64 is better in most cases. But I don't know whether float64 will lead to a decrease in performance. In some special cases, for example X with shape (100000, 64, 64, 3) and axis=(0, 1, 2), would [np.mean(X[:,:,:,0]), np.mean(X[:,:,:,1]), np.mean(X[:,:,:,2])] be better?

seberg (Member) commented Oct 4, 2017

You will have to time things; what is faster is a tradeoff between memory bandwidth, cache size (if the arrays are not huge) and calculation speed. So it depends on the array size (for small arrays the casting may be slower; on the other hand, for very small arrays the Python overhead may dominate). For large arrays it is quite likely that casting comes at little cost, because the operation is memory-bandwidth bound. Additionally, all of these factors may very much depend on the hardware used.

So you will have to time it and try to get an idea. Note that this might change at some point, though I doubt it will any time soon.
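
A minimal way to run that timing (the array size here is just a smaller stand-in; adjust it to the real data before drawing conclusions):

import numpy as np
from timeit import timeit

X = np.random.rand(2000, 64, 64, 3).astype(np.float32)

# Option 1: a single reduction with a float64 accumulator.
t_upcast = timeit(lambda: np.mean(X, axis=(0, 1, 2), dtype=np.float64), number=10)

# Option 2: one float32 mean per channel, the pattern that stayed accurate above.
t_slices = timeit(lambda: [np.mean(X[..., c]) for c in range(X.shape[-1])], number=10)

print(t_upcast, t_slices)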

ghost commented Oct 5, 2017

I will give it a try, thank you.

@mdickinson (Contributor)

An observation that might provide more motivation for documenting the effect: this algorithm discrepancy lies at the heart of some quite surprising (to someone who doesn't know what's going on) Pandas results:

>>> import numpy as np, pandas as pd
>>> x = np.random.randn(10**8).astype(np.float32)**2
>>> df = pd.DataFrame(dict(A=x, B=x))
>>> df.sum().A
72706424.0
>>> df.A.sum()
1.0000158e+08

mattip (Member) commented Jun 11, 2019

Closed by #13737

@mattip mattip closed this as completed Jun 11, 2019
cfreude commented Jul 30, 2019

Sorry to resurrect this topic, but I ran across an issue and I am not sure what's wrong.
I am basically computing the mean over a slice or axis of only 8 random values and get different results even when using float64 (this also happens for some other seeds, not only 1).

import numpy as np
np.random.seed(1)
vals = np.random.rand(8, 3, 3).astype(np.float64)
print(np.mean(vals[:, 2, 2], axis=0, dtype=np.float64) - np.mean(vals, axis=0, dtype=np.float64)[2, 2])
5.551115123125783e-17

os: Linux 64-bit (Linux Mint 19 Cinnamon - Version 3.8.9, Kernel : 4.15.0-50-generic)
conda version : 4.7.5
python version : 2.7.16.final.0 (build h9bab390_0)
numpy version : 1.16.4 (build py27h7e9f1db_0)

Any help is appreciated. :)

seberg (Member) commented Jul 30, 2019

I am not sure how to explain it better. Floating point arithmetic on a computer is not associative, so the order in which operations are done matters, and numpy optimizes that order for speed.

Now, there is the additional point that with one of the orders numpy uses a trick to be more precise, but in either case you have to expect numerical errors of that order of magnitude.
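
The same non-associativity can be seen with plain Python floats; neither result is wrong, they are just two different rounding paths, and the difference is one unit in the last place, the same order of magnitude as the 5.551115123125783e-17 above.

>>> (0.1 + 0.2) + 0.3
0.6000000000000001
>>> 0.1 + (0.2 + 0.3)
0.6
>>> ((0.1 + 0.2) + 0.3) - (0.1 + (0.2 + 0.3))
1.1102230246251565e-16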
