
DOC: Document higher precision in mean/sum/add.reduce along fast axis due to pairwise sum. #9393


Closed
adler-j opened this issue Jul 9, 2017 · 20 comments

Comments

adler-j commented Jul 9, 2017

As mentioned in this stackoverflow question, calling np.mean with the axis argument seems to introduce significantly lower numerical stability. In particular,

>>> import numpy as np
>>> X = np.random.rand(9999, 128, 128, 4).astype('float32')
>>> X.shape
(9999, 128, 128, 4)
>>> np.mean(X, axis=(0, 1, 2))
array([ 0.10241024,  0.10241024,  0.10241024,  0.10241024], dtype=float32)  # should be 0.5
>>> np.mean(X[:, :, :, 0])
0.50000387

whereas calling it with full 64-bit precision gives correct values

>>> np.mean(X.astype('float64'), axis=(0, 1, 2))
array([ 0.50000323,  0.50004907,  0.50003198,  0.49999848])
>>> np.mean(X[:, :, :, 0].astype('float64'))
0.50000323305421812

Since the values are correct without the axis argument, this should likely be classified as a bug.

Edit:

This is on Windows 7 with numpy 1.12.0, but it seems others are experiencing this issue as well.

seberg (Member) commented Jul 9, 2017

That is correct behavior if your array is C-order (the default) and you omit the last axis. It comes down to the fact that we basically get better precision when the operation runs along the fast axis in memory. However, since the fastest axis in memory is the last one, and you are not taking the sum/mean over that axis, you do not get the advantage of this higher precision. The reason is that forcing it in general would involve a significant speed penalty.

The higher precision is due to a pairwise summation, but as I said, you will only get it when the summation is along the fastest axis. It is unfortunate, but the only other option is really to always provide the less precise version. You are not getting worse precision in one case; you are getting better-than-naive precision in the other.
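
To make the fast-axis point concrete, here is a sketch using the array from the original report: copying the data so that the reduced axes become the fast ones in memory lets the pairwise path apply, at the cost of a full copy (np.moveaxis and np.ascontiguousarray are ordinary NumPy calls; upcasting via dtype=np.float64, mentioned further down the thread, is usually the simpler fix).

import numpy as np

X = np.random.rand(9999, 128, 128, 4).astype(np.float32)

# C-order array reduced over axes (0, 1, 2): the kept axis is the fast one,
# so the reduction falls back to naive float32 accumulation and drifts badly
# (roughly 0.102 in the original report instead of 0.5).
print(np.mean(X, axis=(0, 1, 2)))

# Copy with the kept axis moved to the front: each output element now reduces
# one contiguous block, the pairwise summation applies, and the float32 result
# should come out close to 0.5.
Xc = np.ascontiguousarray(np.moveaxis(X, -1, 0))  # shape (4, 9999, 128, 128)
print(np.mean(Xc, axis=(1, 2, 3)))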

As for float32... naive summation of float32 reaches its limits pretty soon (basically, at some point your intermediate sum is so large that adding to it no longer changes it, at least not with enough precision), which is a reason float64 is typically preferred.
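
As a quick, deterministic sanity check of that explanation against the numbers in the original report: the reported mean times the number of summed elements lands almost exactly on 2**24, which is where a float32 accumulator stops registering increments smaller than 1.

>>> import numpy as np
>>> np.float32(2**24) + np.float32(0.5) == np.float32(2**24)  # increments below 1 are lost from here on
True
>>> round(0.10241024 * (9999 * 128 * 128))  # reported mean * number of elements summed
16777216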

Another nice thing would be to provide a more precise summation version, but that is a different issue (you are welcome to chime in). Please feel free to continue the discussion, but since this came up before, I will close the issue for now.
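
For the curious, one classic example of such a more precise summation scheme is compensated (Kahan) summation. A minimal pure-Python sketch, far too slow for real arrays and shown only to illustrate the idea of carrying the lost low-order bits along:

import numpy as np

def kahan_sum(values):
    # Keep the low-order bits that the running total cannot absorb in a
    # separate compensation term instead of silently dropping them.
    total = np.float32(0.0)
    comp = np.float32(0.0)
    for x in values:
        y = np.float32(x) - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

x = np.random.rand(100000).astype(np.float32)
print(kahan_sum(x), np.sum(x, dtype=np.float64))  # should agree to roughly float32 precision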

@seberg seberg closed this as completed Jul 9, 2017
adler-j (Author) commented Jul 9, 2017

Please feel free to continue the discussion, but since this came up before, I will close the issue for now.

Would it not be reasonable to document this in the method, even if we keep the behavior as is?

Edit:

Another idea would be to somehow provide a way to compute the 64-bit precision sum of a 32-bit array.

seberg (Member) commented Jul 9, 2017

Indeed, it would be good to mention it in the notes; can I persuade you to add it? :)

adler-j (Author) commented Jul 9, 2017

I'll see what I can do!

seberg (Member) commented Jul 9, 2017

I suppose one thing that is a bit annoying is that this is inherited from sum, so in principle any sum-based method has this behavior, and adding a note to all of them might be a bit much....

njsmith (Member) commented Jul 9, 2017

sum and mean already cover a lot of the cases where someone is likely to run into this though, and documenting just those would at least give someone a fighting chance to figure out what's going on for other cases too :-). Which other cases are you thinking of that are likely to be affected?

seberg (Member) commented Jul 9, 2017

I didn't think about it much; corr/cov come to mind as well, and maybe some linalg stuff, but yes, I agree sum/mean are probably good enough and reasonable. np.add, as the ufunc behind sum, should maybe also get a note (though it will only show up for its reduce method).

@seberg seberg reopened this Jul 9, 2017
@seberg seberg changed the title Numerical precision of np.mean lacking with axis argument DOC: Document higher precision in mean/sum/add.reduce along fast axis due to pairwise sum. Jul 9, 2017
@kevinjos

Is the behavior described above also responsible for the following result?

>>> X = np.array(np.random.random((9999, 128, 128, 4)) * 1e5, dtype='float32')
>>> X.shape
(9999, 128, 128, 4)
>>> mean_by_axis = np.mean(X, axis=(0, 1, 2))
>>> mean_by_axis
array([ 13423.11523438,  13423.11523438,  13423.11523438,  13423.11523438], dtype=float32)
>>> mean = np.mean(X[:, :, :, 0])
>>> mean
50001.297

In this case, I would expect to see a WARNING emitted after the call to np.mean(X, axis=(0, 1, 2)).

seberg (Member) commented Jul 11, 2017

@kevinjos, your example is identical to the one above (up to a constant factor, which does not matter for floating point numbers). I will point out again that in the second case numpy happens to save you from trouble; it does not put you into trouble in the first case.

Providing specifically stable sums would be nice, but it is a different issue. And yes, it is annoying that numpy can save you in most cases, but a small change in code/data structure might "disable" it (heck we even hesitated adding this feature because of that).

I don't mind the idea of warning about likely precision problems as such, but that is also a separate issue (plus it is unclear when exactly to warn and what the thresholds would be), and I frankly hope that any introductory course mentions that these things happen (I certainly do in mine). We could possibly warn about likely programming issues with low-precision floating point numbers (and even for float64, I suppose), but there is a reason float32 is not the default in any language: it can be very imprecise unless you know what you are doing. I also use it often (e.g. for processing large videos that originally have 16-bit precision), but that data is normalized, and if something went badly wrong we would actually see the effect visually.

@eric-wieser (Member)

The higher precision is due to a pairwise summation, but as I said, you will only get it when the summation is along the fastest axis.

Why is this the case? In both the examples given by @kevinjos, the last axis is already non-contiguous.

Does this become a speed vs precision trade-off? Could that trade-off be exposed to the end-user?

seberg (Member) commented Jul 11, 2017

@eric-wieser it is the fastest axis vs. not the fastest axis. And no, there is no speed trade-off: the extra precision comes at no cost when the reduction runs along the fast axis, whereas you would obviously take a potentially big performance hit if you forced the other iteration order.

ghost commented Oct 4, 2017

Using an array view can avoid the precision loss, is this right?

seberg (Member) commented Oct 4, 2017

? Not sure what you mean; views themselves do not do anything. You can use np.mean(..., dtype=np.float64) to make the result and the intermediate results float64 even if the data is float32, which may be good enough?
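
Concretely, for the float32 array from the top of the thread, that suggestion looks like this (the accumulation and the returned values are then float64):

import numpy as np

X = np.random.rand(9999, 128, 128, 4).astype(np.float32)

# Same reduction as in the original report, but accumulated in float64:
# the per-channel means should come out close to 0.5 instead of ~0.102.
print(np.mean(X, axis=(0, 1, 2), dtype=np.float64))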

ghost commented Oct 4, 2017

Yes, float64 is better in most cases. But I don't know whether float64 will lead to a decrease in performance. In some special cases, for example X with shape (100000, 64, 64, 3) and axis=(0, 1, 2), would [np.mean(X[:,:,:,0]), np.mean(X[:,:,:,1]), np.mean(X[:,:,:,2])] be better?

seberg (Member) commented Oct 4, 2017

You will have to time things; what is faster is a tradeoff between memory bandwidth, cache size (if the arrays are not huge) and calculation speed. So it depends on the array size (for small arrays the casting may be slower; on the other hand, for very small arrays the Python overhead may dominate). For large arrays it is quite likely that casting comes at little cost, because the operation is memory-bandwidth bound. Additionally, all of these factors may very much depend on the hardware used.

So you will have to time it and try to get an idea. Note that this might change at some point, though I doubt it will any time soon.
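
A minimal way to run that timing (the array size here is just a smaller stand-in; adjust it to the real data before drawing conclusions):

import numpy as np
from timeit import timeit

X = np.random.rand(2000, 64, 64, 3).astype(np.float32)

# Option 1: a single reduction with a float64 accumulator.
t_upcast = timeit(lambda: np.mean(X, axis=(0, 1, 2), dtype=np.float64), number=10)

# Option 2: one float32 mean per channel, the pattern that stayed accurate above.
t_slices = timeit(lambda: [np.mean(X[..., c]) for c in range(X.shape[-1])], number=10)

print(t_upcast, t_slices)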

ghost commented Oct 5, 2017

I will give it a try, thank you.

@mdickinson (Contributor)

An observation that might provide more motivation for documenting the effect: this algorithm discrepancy lies at the heart of some quite surprising (to someone who doesn't know what's going on) Pandas results:

>>> import numpy as np, pandas as pd
>>> x = np.random.randn(10**8).astype(np.float32)**2
>>> df = pd.DataFrame(dict(A=x, B=x))
>>> df.sum().A
72706424.0
>>> df.A.sum()
1.0000158e+08

mattip (Member) commented Jun 11, 2019

Closed by #13737

@mattip mattip closed this as completed Jun 11, 2019
cfreude commented Jul 30, 2019

Sorry to resurrect this topic, but I ran across an issue and I am not sure what's wrong.
I am basically computing the mean over a slice or axis of only 8 random values and get different results even when using float64 (this also happens for some other seeds, not only 1).

import numpy as np
np.random.seed(1)
vals = np.random.rand(8, 3, 3).astype(np.float64)
print(np.mean(vals[:, 2, 2], axis=0, dtype=np.float64) - np.mean(vals, axis=0, dtype=np.float64)[2, 2])
5.551115123125783e-17

os: Linux 64-bit (Linux Mint 19 Cinnamon - Version 3.8.9, Kernel : 4.15.0-50-generic)
conda version : 4.7.5
python version : 2.7.16.final.0 (build h9bab390_0)
numpy version : 1.16.4 (build py27h7e9f1db_0)

Any help is appreciated. :)

seberg (Member) commented Jul 30, 2019

I am not sure how to explain it better. Floating point arithmetic on a computer is not associative, so the order in which operations are done matters, and numpy optimizes that order for speed.

Now, there is the additional point that with one of the orders numpy uses a trick to be more precise, but in either case you have to expect numerical errors of that order of magnitude.
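
The same non-associativity can be seen with plain Python floats; neither result is wrong, they are just two different rounding paths, and the difference is one unit in the last place, the same order of magnitude as the 5.551115123125783e-17 above.

>>> (0.1 + 0.2) + 0.3
0.6000000000000001
>>> 0.1 + (0.2 + 0.3)
0.6
>>> ((0.1 + 0.2) + 0.3) - (0.1 + (0.2 + 0.3))
1.1102230246251565e-16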
