BUG rounding error in divmod #6127

Closed
mhvk opened this issue Jul 28, 2015 · 29 comments

Comments

@mhvk
Contributor

mhvk commented Jul 28, 2015

The ndarray implementation of divmod seems to do some incorrect rounding:

# regular python
divmod(78*6e-8, 6e-8)
# (77.0, 5.999999999999965e-08)
# makes sense; cannot represent number precisely as float
import numpy as np
np.__version__
# '1.10.0.dev0+00f4fae'
divmod(np.array(78*6e-8), 6e-8)
# (78.0, 5.9999999999999651e-08)
# Oops!!
divmod(np.arange(77, 80)*6e-8, 6e-8)
# (array([ 77.,  78.,  79.]),
#   array([  2.24993127e-22,   6.00000000e-08,   6.00000000e-08]))
@mhvk
Contributor Author

mhvk commented Jul 28, 2015

p.s. Bug initially found by @MatthewQuenneville

@mhvk
Contributor Author

mhvk commented Jul 29, 2015

A bit of further checking shows that divmod just calls floor_divide and remainder separately (https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/number.c#L781), and that floor_divide gives a different answer from Python's version:

78*6e-8 // 6e-8
# 77.0
np.floor_divide(78*6e-8, 6e-8)
# 78.0

Of course, what is "wrong" is a bit in the eye of the beholder:

78*6e-8 / 6e-8 == 78
# True

In any case, divmod should at least give a remainder that is consistent with the quotient. I fear one either has to write a new inner loop or ensure the remainder is calculated from the quotient (or vice versa).

A possible cause of the problem may be a check on negative remainders done in https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/loops.c.src#L1004: the remainder is increased by one unit, but the quotient would not be correspondingly decreased.
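
One way to see why both 77 and 78 are defensible answers is to redo the division exactly on the stored double values. The sketch below uses only the standard-library fractions module; the variable names are illustrative and not taken from numpy or CPython.

```python
from fractions import Fraction
import math

a = 78 * 6e-8   # the product is rounded to the nearest double
b = 6e-8

# Exact rational value of the ratio of the two stored doubles.
exact_ratio = Fraction(a) / Fraction(b)

# The exact ratio falls just short of 78, so its floor is 77 ...
exact_floor = math.floor(exact_ratio)

# ... while true division rounds that ratio to the nearest double.
rounded_ratio = a / b

print(exact_floor, rounded_ratio)
```

So flooring the exact ratio and flooring the rounded ratio can legitimately differ by one; the remainder has to be chosen to match whichever quotient is reported.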

@pitrou
Member

pitrou commented Sep 15, 2015

Given that:

>>> (78*6e-8) / 6e-8
78.0

It seems that Python's floor division is a bit problematic:

>>> (78*6e-8) // 6e-8
77.0

@pitrou
Member

pitrou commented Sep 15, 2015

For reference, I reported a Python issue at http://bugs.python.org/issue25129

@mhvk
Contributor Author

mhvk commented Sep 15, 2015

Not really, that could simply be a floating point rounding error, and the remainder is consistent with that.

All that matters is that one can add back, and this is fine for python:

q, r = divmod(78*6e-8, 6e-8)
q, r
# (77.0, 5.999999999999965e-08)
(q * 6e-8 + r) / 6e-8
# 78.0

In contrast, with a numpy array:

q, r = divmod(np.array(78*6e-8), 6e-8)
(q * 6e-8 + r) / 6e-8
# OOPS:  78.999999999999986
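
The "add back" criterion can be written as a small check. This is only a sketch: divmod_is_consistent and its rel_tol parameter are hypothetical names, and the numpy pair is copied from the report above rather than recomputed.

```python
import math

def divmod_is_consistent(a, b, q, r, rel_tol=1e-12):
    """Return True if (q, r) is a plausible divmod(a, b) result."""
    # r must be zero, or share the divisor's sign and be smaller in magnitude.
    same_sign = math.copysign(1.0, r) == math.copysign(1.0, b)
    in_range = (r == 0) or (same_sign and abs(r) < abs(b))
    # Adding back must recover a, up to rounding.
    adds_back = math.isclose(q * b + r, a, rel_tol=rel_tol)
    return in_range and adds_back

a, b = 78 * 6e-8, 6e-8

python_pair = divmod(a, b)                    # (77.0, 5.999999999999965e-08)
numpy_pair = (78.0, 5.9999999999999651e-08)   # values quoted in the report

print(divmod_is_consistent(a, b, *python_pair))
print(divmod_is_consistent(a, b, *numpy_pair))
```

The quoted numpy pair passes the range test on its own; it only fails once quotient and remainder are combined, which is why the two values need to be checked together.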

@charris
Member

charris commented Sep 15, 2015

Looks to me like it depends on the details of the computation

In [6]: '%25.18e' % ((78 * 6e-8)/6e-8)
Out[6]: ' 7.800000000000000000e+01'

In [7]:  a = (78 * 6e-8)

In [8]:  a / 6e8
Out[8]: 7.799999999999999e-15

@charris
Member

charris commented Sep 15, 2015

nvm. Yes, it looks like numpy gets the remainder wrong.

@mhvk
Contributor Author

mhvk commented Sep 15, 2015

> nvm. Yes, it looks like numpy gets the remainder wrong.

I think the problem is that the two are not calculated together, so that the adjustment in the remainder is not propagated to the quotient.

@pitrou
Member

pitrou commented Sep 15, 2015

Indeed, CPython internally has a unique divmod function that gets called by all three operations: divmod(), floor division and remainder.
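
For readers who want to see what "computing them together" can look like, here is a Python sketch loosely modelled on the float divmod routine in CPython's Objects/floatobject.c. It is an approximation of that approach written from memory, not a transcription; the name float_divmod_sketch is made up, and infinities and NaNs are not handled.

```python
import math

def float_divmod_sketch(a, b):
    """Compute (a // b, a % b) from one shared fmod call."""
    if b == 0.0:
        raise ZeroDivisionError("float divmod()")
    mod = math.fmod(a, b)          # exact; takes the sign of a
    div = (a - mod) / b            # close to an integer
    if mod:
        # Move the remainder to the divisor's sign and
        # compensate in the quotient at the same time.
        if (b < 0) != (mod < 0):
            mod += b
            div -= 1.0
    else:
        mod = math.copysign(0.0, b)
    if div:
        # Snap the quotient to the nearest integer.
        floordiv = math.floor(div)
        if div - floordiv > 0.5:
            floordiv += 1.0
    else:
        floordiv = math.copysign(0.0, a / b)
    return float(floordiv), mod

print(float_divmod_sketch(78 * 6e-8, 6e-8))
```

Because the quotient is derived from the same remainder that is returned, any sign adjustment to one is applied to the other, which is the property missing from the separate floor_divide and remainder calls.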

@argriffing
Contributor

Consider a hypothetical floor_sub(78, 1e-15). Would you want it to give 78 because 78 - 1e-15 == 78 in double precision or would you want it to give the more mathematically correct answer 77?
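
The two candidate behaviours of this hypothetical floor_sub can be compared directly. Neither function below exists in Python or numpy; they are illustrations only, with the exact variant built on the standard-library fractions module.

```python
from fractions import Fraction
import math

def floor_sub_rounded(x, y):
    """Hypothetical floor_sub: floor after an ordinary double subtraction."""
    return math.floor(x - y)

def floor_sub_exact(x, y):
    """Hypothetical floor_sub: floor of the exact difference of the doubles."""
    return math.floor(Fraction(x) - Fraction(y))

print(floor_sub_rounded(78.0, 1e-15))
print(floor_sub_exact(78.0, 1e-15))
```

The rounded version sees 78.0 because the subtraction result is rounded back to 78.0 before floor is applied; the exact version works on the unrounded difference.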

@mhvk
Contributor Author

mhvk commented Sep 15, 2015

@argriffing - I know one cannot always avoid precision/rounding errors; however, the remainder and floor division should be consistent with each other, in that if q, r = divmod(a, b), one should have q*b+r = a, at least approximately; this is violated in the numpy example I showed.

@argriffing
Contributor

@mhvk Yes, I was responding to @pitrou's suggestion that Python's floor division is problematic. Sorry that wasn't clear in my comment! I agree that numpy has a divmod bug.

@anntzer
Contributor

anntzer commented Dec 30, 2015

Here's a case that's perhaps less contrived, only using "non-small" numbers:

In [3]: divmod(np.float64(1.0), 0.2)
Out[3]: (5.0, 0.19999999999999996) # oops

@charris charris added this to the 1.11.0 release milestone Dec 30, 2015
anntzer added a commit to anntzer/matplotlib that referenced this issue Dec 31, 2015
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Note that this requires working around a bug in numpy's implementation
of divmod (numpy/numpy#6127).

Many test images have changed!

See matplotlib#5767.
anntzer added a commit to anntzer/matplotlib that referenced this issue Dec 31, 2015
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Note that this requires working around a bug in numpy's implementation
of divmod (numpy/numpy#6127).

Many test images have changed!

See matplotlib#5767.

Probably also wraps up work on matplotlib#5738, but tests are missing.
anntzer added a commit to anntzer/matplotlib that referenced this issue Dec 31, 2015
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Note that this requires working around a bug in numpy's implementation
of divmod (numpy/numpy#6127).

Many test images have changed!

See matplotlib#5767.

Also some more progress on matplotlib#5738.
anntzer added a commit to anntzer/matplotlib that referenced this issue Jan 1, 2016
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Support for the (unused and deprecated-in-comment) "trim" keyword
argument has been dropped as the way ticks are picked has changed.

Many test images have changed!

Implementation notes:
- A bug in numpy's implementation of divmod (numpy/numpy#6127) is worked
around.
- The implementation of scale_range has also been cleaned.

See matplotlib#5767, matplotlib#5738.
anntzer added a commit to anntzer/matplotlib that referenced this issue Jan 2, 2016
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Support for the (unused and deprecated-in-comment) "trim" keyword
argument has been dropped as the way ticks are picked has changed.

Many test images have changed!

Implementation notes:
- A bug in numpy's implementation of divmod (numpy/numpy#6127) is worked
around.
- The implementation of scale_range has also been cleaned.

See matplotlib#5767, matplotlib#5738.
@charris
Member

charris commented Jan 12, 2016

Well, I can fix the Numpy end of this, but np.float64 (scalar) inherits from Python and numpy divmod never gets called. In Python 2.7:

In [1]: divmod(np.float32(78*6e-8), np.float64(6e-8))
Hi from divmod
Out[1]: (78.0, 7.4586133303649119e-14)

In [2]: divmod(np.float64(78*6e-8), np.float32(6e-8))
Out[2]: (78.0, 1.6699839685478494e-13)

In [3]: divmod(np.float64(78*6e-8), np.float64(6e-8))
Out[3]: (78.0, 5.9999999999999651e-08)

In [4]: divmod(np.float32(78*6e-8), np.float64(6e-8))
Hi from divmod
Out[4]: (78.0, 7.4586133303649119e-14)

In [5]: divmod(78*6e-8, 6e-8)
Out[5]: (77.0, 5.999999999999965e-08)

I think Python is doing something inconsistent here.

@anntzer
Contributor

anntzer commented Jan 13, 2016

There are two places for overriding divmod: scalarmath.c.src (@type@_ctype_divmod) and number.c (array_divmod). I haven't looked in depth but from quick tests (where / really means divmod):

  • 64/64 calls double_ctype_divmod
  • 64/32 calls double_ctype_divmod
  • 32/64 calls array_divmod
  • 32/32 calls float_ctype_divmod
  • pythonfloat/64 and 64/pythonfloat call double_ctype_divmod
  • pythonfloat/32 and 32/pythonfloat call array_divmod

so at least everything is overridable.

@charris
Member

charris commented Jan 13, 2016

scalarmath calls array_divmod under the covers.

Oops, correction: it is scalartypes.c.src, not scalarmath, that calls array_divmod.

@anntzer
Contributor

anntzer commented Jan 13, 2016

I don't know exactly what you mean, but inserting printfs at the entry of all of these functions only results in the given functions being logged; e.g. float64/float64 doesn't seem to call array_divmod at any point.

@charris
Member

charris commented Jan 13, 2016

Oh, that is ugly. The umath module initializes the scalar type function tables when it loads. I knew it passed a number of functions to a function table for the arrays, so I guess I should have expected it. I wonder if this would have an effect on the proposed uses of __numpy_ufunc__?

@mhvk
Contributor Author

mhvk commented Jan 13, 2016

The python solution of always calling the same function for all of floor_div, remainder, and divmod seems to make the most sense. Presumably one can pass on some flags that tell which result(s) one is actually interested in.

@charris
Member

charris commented Jan 15, 2016

The problem currently is that numpy uses C floor for a//b and C % for the remainder. Apparently % isn't completely consistent with floor. Probably the easiest way to get this (almost) consistent is to redefine a%b as a - b*floor(a/b) in the computations.

@anntzer
Contributor

anntzer commented Jan 15, 2016

Indeed, it is amazing that

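#include <math.h>
#include <stdio.h>

int main(void) {
    double a = 1., b = .2;
    /* floor of the rounded quotient, next to fmod of the same operands */
    printf("%.10f %.10f", floor(a / b), fmod(a, b));
    return 0;
}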

prints 5.0000000000 0.2000000000 even though fmod is documented as "The floating-point remainder of the division operation x/y calculated by this function is exactly the value x - n*y, where n is x/y with its fractional part truncated." and floor as "Computes the largest integer value not greater than arg."

PS: Confirmed with glibc 2.22/gcc 5.3.0 (Linux) and MSVC2015 (Windows).
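
The same experiment can be repeated in Python, where math.floor and math.fmod wrap the C library functions. The sketch below also checks the stored value of 0.2 with the fractions module, on the assumption that this explains the mismatch: floor sees the already-rounded quotient, while fmod works from the stored operands.

```python
from fractions import Fraction
import math

a, b = 1.0, 0.2

# The double nearest to 0.2 is slightly larger than 1/5,
# so five copies of it do not fit into 1.0 exactly.
print(Fraction(b) > Fraction(1, 5))

q = math.floor(a / b)   # a / b rounds to exactly 5.0 before floor sees it
r = math.fmod(a, b)     # computed exactly from the stored doubles

print("%.10f %.10f" % (q, r))
print(q * b + r)        # adding back overshoots a
```

Each function satisfies its own definition; the inconsistency only appears when their results are paired as a quotient and remainder.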

@charris
Member

charris commented Jan 15, 2016

Yep, fmod doesn't use floor, but rather float truncation. Fixing the remainder alone doesn't work either as rounding error is still a factor. For instance

In [3]: divmod(array(78*6e-8), 6e-8)
Out[3]: (78.0, -3.4410713482205951e-22)

which is incorrect because the remainder is negative but should have the same sign as the divisor (python definition). So it seems that the only safe way to have // consistent with % is to compute them together.

@anntzer
Contributor

anntzer commented Jan 15, 2016

I don't think that's the difference (everything is positive), using trunc here (in C) gives the same result.

@charris
Member

charris commented Jan 15, 2016

A fix is to express the remainder in the form rem = b * (a/b - floor(a/b)). The difference d in parentheses always satisfies 0 <= d < 1. For the problem at hand:

In [3]: divmod(array(78*6e-8), array(6e-8))
Out[3]: (78.0, 0.0)

In [4]: divmod(float64(78*6e-8), float64(6e-8))
Out[4]: (78.0, 0.0)

In [5]: divmod(float64(78*6e-8), float32(6e-8))
Out[5]: (78.0, 1.6699839677396904e-13)

In [6]: divmod(float32(78*6e-8), float64(6e-8))
Out[6]: (78.0, 7.4586133109733047e-14)

In [7]: divmod(float32(78*6e-8), float32(6e-8))
Out[7]: (78.0, 4.5776366e-13)

Which looks good enough.
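
In pure Python the formula can be sketched as below. This only illustrates the arithmetic for plain doubles; it is not numpy's C loop, and floor_divmod_sketch is a made-up name.

```python
import math

def floor_divmod_sketch(a, b):
    """Quotient and remainder both derived from a single a / b."""
    ratio = a / b
    q = math.floor(ratio)
    # Fractional part of the rounded ratio, scaled back by the divisor.
    r = b * (ratio - q)
    return float(q), r

print(floor_divmod_sketch(78 * 6e-8, 6e-8))
print(floor_divmod_sketch(-7.5, 2.0))
```

Since both outputs come from the same rounded ratio, the quotient and remainder cannot disagree about which side of an integer the ratio fell on.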

@mhvk
Contributor Author

mhvk commented Jan 15, 2016

Agreed that the two need to go through the same path.

@charris
Member

charris commented Jan 15, 2016

I think we can get around the same path problem as long as division and floor are repeatable. Which may not always be the case on x86, but...

charris added a commit to charris/numpy that referenced this issue Jan 15, 2016
This is apropos numpy#6127. The fix is to make the functions floor_division
and remainder consistent, i.e.,

    b * floor_division(a, b) + remainder(a, b) == a

Previous to this fix remainder was computed at the C level using the '%'
operator, and the result was not always consistent with the floor
function. The current approach is to compute the remainder using

    b * (a/b - floor(a/b))

which is both consistent with the Python '%' operator and numerically
consistent with floor_division implemented using the floor function.

Closes numpy#6127.
@mhvk
Contributor Author

mhvk commented Jan 17, 2016

Happy to see this resolved; solution looks good!

@charris
Member

charris commented Jan 17, 2016

@mhvk, I think it is actually better than Python at the moment. However, at some point it will hit the precision limits of the floats...
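
A small illustration of that precision limit, assuming the remainder is computed as b * (a/b - floor(a/b)) on plain doubles (remainder_via_ratio is a hypothetical helper, not numpy code): once a/b is too large to carry fractional bits, the formula can only return zero, whereas fmod still yields the exact remainder of the stored values.

```python
import math

def remainder_via_ratio(a, b):
    """Remainder computed as b * (a/b - floor(a/b))."""
    ratio = a / b
    return b * (ratio - math.floor(ratio))

a, b = 1e20, 3.0   # 1e20 is exactly representable as a double

print(remainder_via_ratio(a, b))   # the rounded ratio has no fractional bits left
print(math.fmod(a, b))             # exact remainder of the stored values
```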

@pitrou
Member

pitrou commented Feb 10, 2016

There seems to be a regression now: #7224

anntzer added a commit to anntzer/matplotlib that referenced this issue Feb 15, 2016
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Support for the (unused and deprecated-in-comment) "trim" keyword
argument has been dropped as the way ticks are picked has changed.

Many test images have changed!

Implementation notes:
- A bug in numpy's implementation of divmod (numpy/numpy#6127) is worked
around.
- The implementation of scale_range has also been cleaned.

See matplotlib#5767, matplotlib#5738.
jenshnielsen pushed a commit to jenshnielsen/matplotlib that referenced this issue Mar 20, 2016
plt.plot([-.1, .2]) used to pick (in round numbers mode) [-.1, .25] as
ylims due to floating point inaccuracies; change it to pick [-.1, .2]
(up to floating point inaccuracies).

Support for the (unused and deprecated-in-comment) "trim" keyword
argument has been dropped as the way ticks are picked has changed.

Many test images have changed!

Implementation notes:
- A bug in numpy's implementation of divmod (numpy/numpy#6127) is worked
around.
- The implementation of scale_range has also been cleaned.

See matplotlib#5767, matplotlib#5738.
jaimefrio pushed a commit to jaimefrio/numpy that referenced this issue Mar 22, 2016
This is apropos numpy#6127. The fix is to make the functions floor_division
and remainder consistent, i.e.,

    b * floor_division(a, b) + remainder(a, b) == a

Previous to this fix remainder was computed at the C level using the '%'
operator, and the result was not always consistent with the floor
function. The current approach is to compute the remainder using

    b * (a/b - floor(a/b))

which is both consistent with the Python '%' operator and numerically
consistent with floor_division implemented using the floor function.

Closes numpy#6127.
tacaswell added a commit to tacaswell/matplotlib that referenced this issue Apr 2, 2016
This test only ever passed due to an error arising from a bug
in numpy's divmod being fixed (numpy/numpy#6127).

See matplotlib#5950
tacaswell added a commit to tacaswell/matplotlib that referenced this issue Apr 3, 2016
This test only ever passed due to an error arising from a bug
in numpy's divmod being fixed (numpy/numpy#6127).

See matplotlib#5950