Commit ab86c2b

k_mul() comments: In honor of Dijkstra, made the proof that "t3 fits"
rigorous instead of hoping for testing not to turn up counterexamples. Call me heretical, but despite that I'm wholly confident in the proof, and have done it two different ways now, I still put more faith in testing ...
1 parent 9973d74 commit ab86c2b

1 file changed

Lines changed: 37 additions & 34 deletions

Objects/longobject.c

@@ -1757,40 +1757,43 @@ k_mul(PyLongObject *a, PyLongObject *b)
 
 /* (*) Why adding t3 can't "run out of room" above.
 
-We allocated space for asize + bsize result digits. We're adding t3 at an
-offset of shift digits, so there are asize + bsize - shift allocated digits
-remaining. Because degenerate shifts of "a" were weeded out, asize is at
-least shift + 1. If bsize is odd then bsize == 2*shift + 1, else bsize ==
-2*shift. Therefore there are at least shift+1 + 2*shift - shift =
-
-2*shift+1 allocated digits remaining when bsize is even, or at least
-2*shift+2 allocated digits remaining when bsize is odd.
-
-Now in bh+bl, if bsize is even bh has at most shift digits, while if bsize
-is odd bh has at most shift+1 digits. The sum bh+bl has at most
-
-    shift digits plus 1 bit when bsize is even
-    shift+1 digits plus 1 bit when bsize is odd
-
-The same is true of ah+al, so (ah+al)(bh+bl) has at most
-
-    2*shift digits + 2 bits when bsize is even
-    2*shift+2 digits + 2 bits when bsize is odd
-
-If bsize is even, we have at most 2*shift digits + 2 bits to fit into at
-least 2*shift+1 digits. Since a digit has SHIFT bits, and SHIFT >= 2,
-there's always enough room to fit the 2 bits into the "spare" digit.
-
-If bsize is odd, we have at most 2*shift+2 digits + 2 bits to fit into at
-least 2*shift+2 digits, and there's not obviously enough room for the
-extra two bits. We need a sharper analysis in this case. The major
-laziness was in the "the same is true of ah+al" clause: ah+al can't actually
-have shift+1 digits + 1 bit unless bsize is odd and asize == bsize. In that
-case, we actually have (2*shift+1)*2 - shift = 3*shift+2 allocated digits
-remaining, and that's obviously plenty to hold 2*shift+2 digits + 2 bits.
-Else (bsize is odd and asize < bsize) ah and al each have at most shift digits,
-so ah+al has at most shift digits + 1 bit, and (ah+al)*(bh+bl) has at most
-2*shift+1 digits + 2 bits, and again 2*shift+2 digits is enough to hold it.
+Let f(x) mean the floor of x and c(x) mean the ceiling of x. Some facts
+to start with:
+
+1. For any integer i, i = c(i/2) + f(i/2). In particular,
+   bsize = c(bsize/2) + f(bsize/2).
+2. shift = f(bsize/2)
+3. asize <= bsize
+4. Since we call k_lopsided_mul if asize*2 <= bsize, asize*2 > bsize in this
+   routine, so asize > bsize/2 >= f(bsize/2) in this routine.
+
+We allocated asize + bsize result digits, and add t3 into them at an offset
+of shift. This leaves asize+bsize-shift allocated digit positions for t3
+to fit into, = (by #1 and #2) asize + f(bsize/2) + c(bsize/2) - f(bsize/2) =
+asize + c(bsize/2) available digit positions.
+
+bh has c(bsize/2) digits, and bl at most f(bsize/2) digits. So bh+bl has
+at most c(bsize/2) digits + 1 bit.
+
+If asize == bsize, ah has c(bsize/2) digits, else ah has at most f(bsize/2)
+digits, and al has at most f(bsize/2) digits in any case. So ah+al has at
+most (asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 1 bit.
+
+The product (ah+al)*(bh+bl) therefore has at most
+
+    c(bsize/2) + (asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 2 bits
+
+and we have asize + c(bsize/2) available digit positions. We need to show
+this is always enough. An instance of c(bsize/2) cancels out in both, so
+the question reduces to whether asize digits is enough to hold
+(asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 2 bits. If asize < bsize,
+then we're asking whether asize digits >= f(bsize/2) digits + 2 bits. By #4,
+asize is at least f(bsize/2)+1 digits, so this in turn reduces to whether 1
+digit is enough to hold 2 bits. This is so since SHIFT=15 >= 2. If
+asize == bsize, then we're asking whether bsize digits is enough to hold
+c(bsize/2) digits + 2 bits, or equivalently (by #1) whether f(bsize/2) digits
+is enough to hold 2 bits. This is so if bsize >= 2, which holds because
+bsize >= KARATSUBA_CUTOFF >= 2.
 
 Note that since there's always enough room for (ah+al)*(bh+bl), and that's
 clearly >= each of ah*bh and al*bl, there's always enough room to subtract
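
In the spirit of the commit message's remark about testing, the digit-counting argument in the new comment can also be checked mechanically. The following is a minimal sketch and not code from longobject.c: the names SHIFT_BITS, t3_bound_bits, room_bits and t3_always_fits are hypothetical, and SHIFT_BITS is set to 15 because the comment says SHIFT=15. It compares only the stated upper bound on (ah+al)*(bh+bl) with the allocated room; it does not perform any multiplication. Under this bound the asize == bsize case needs f(bsize/2) >= 1, so the exhaustive check is meant to start at bsize = 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bits per digit; assumed to be 15, as stated in the comment ("SHIFT=15"). */
#define SHIFT_BITS 15

/* Upper bound, in bits, on t3 = (ah+al)*(bh+bl) per the counting argument:
 * c(bsize/2) + (asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 2 bits. */
static size_t t3_bound_bits(size_t asize, size_t bsize)
{
    /* Facts #3 and #4: asize <= bsize and asize*2 > bsize. */
    assert(asize <= bsize && asize * 2 > bsize);

    size_t f = bsize / 2;   /* f(bsize/2), which is also shift (fact #2) */
    size_t c = bsize - f;   /* c(bsize/2), by fact #1 */
    size_t a_digits = (asize == bsize) ? c : f;
    return (c + a_digits) * SHIFT_BITS + 2;
}

/* Bits available to t3: asize + bsize - shift allocated digit positions. */
static size_t room_bits(size_t asize, size_t bsize)
{
    size_t shift = bsize / 2;
    return (asize + bsize - shift) * SHIFT_BITS;
}

/* Returns true if the bound fits in the available room for every pair
 * (asize, bsize) with min_bsize <= bsize <= max_bsize and
 * bsize/2 < asize <= bsize. */
static bool t3_always_fits(size_t min_bsize, size_t max_bsize)
{
    for (size_t bsize = min_bsize; bsize <= max_bsize; bsize++) {
        for (size_t asize = bsize / 2 + 1; asize <= bsize; asize++) {
            if (t3_bound_bits(asize, bsize) > room_bits(asize, bsize))
                return false;
        }
    }
    return true;
}
```

Calling t3_always_fits(2, n) for some moderately large n exercises every admissible size pair up to n digits. A check like this complements the proof rather than replacing it, since it covers only finitely many sizes.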
