Commit f361515

A quicker astimezone() implementation, rehabilitating an earlier
suggestion from Guido, along with a formal correctness proof of the
trickiest bit. The intricacy of the proof reveals how delicate this
is, but also how robust the conclusion: correctness doesn't rely on
dst() returning +- one hour (not all real time zones do!), it only
relies on:

1. That dst() returns a (any) non-zero value if and only if daylight
   time is in effect.

and

2. That the tzinfo subclass implements a consistent notion of time zone.

The meaning of "consistent" was a hidden assumption, which is now an
explicit requirement in the docs. Alas, it's an unverifiable (by the
datetime implementation) requirement, but so it goes.
1 parent 0233bd9 commit f361515

3 files changed

Lines changed: 199 additions & 71 deletions

Doc/lib/libdatetime.tex

Lines changed: 23 additions & 6 deletions
@@ -887,10 +887,10 @@ \subsection{\class{tzinfo} Objects \label{datetime-tzinfo}}
887887
the magnitude of the offset must be less than one day), or a
888888
\class{timedelta} object representing a whole number of minutes
889889
in the same range. Most implementations of \method{utcoffset()}
890-
will probably look like:
890+
will probably look like one of these two:
891891

892892
\begin{verbatim}
893-
return CONSTANT # fixed-offset class
893+
return CONSTANT # fixed-offset class
894894
return CONSTANT + self.dst(dt) # daylight-aware class
895895
\end{verbatim}
896896
\end{methoddesc}
@@ -905,12 +905,13 @@ \subsection{\class{tzinfo} Objects \label{datetime-tzinfo}}
905905
rather than a fixed string primarily because some \class{tzinfo} objects
906906
will wish to return different names depending on the specific value
907907
of \var{dt} passed, especially if the \class{tzinfo} class is
908-
accounting for DST.
908+
accounting for daylight time.
909909
\end{methoddesc}
910910

911911
\begin{methoddesc}{dst}{self, dt}
912-
Return the DST offset, in minutes east of UTC, or \code{None} if
913-
DST information isn't known. Return \code{0} if DST is not in effect.
912+
Return the daylight saving time (DST) adjustment, in minutes east of
913+
UTC, or \code{None} if DST information isn't known. Return \code{0} if
914+
DST is not in effect.
914915
If DST is in effect, return the offset as an integer or
915916
\class{timedelta} object (see \method{utcoffset()} for details).
916917
Note that DST offset, if applicable, has
@@ -919,7 +920,23 @@ \subsection{\class{tzinfo} Objects \label{datetime-tzinfo}}
919920
unless you're interested in displaying DST info separately. For
920921
example, \method{datetimetz.timetuple()} calls its \member{tzinfo}
921922
member's \method{dst()} method to determine how the
922-
\member{tm_isdst} flag should be set.
923+
\member{tm_isdst} flag should be set, and
924+
\method{datetimetz.astimezone()} calls \method{dst()} to account for
925+
DST changes when crossing time zones.
926+
927+
An instance \var{tz} of a \class{tzinfo} subclass that models both
928+
standard and daylight times must be consistent in this sense:
929+
930+
\code{tz.utcoffset(dt) - tz.dst(dt)}
931+
932+
must return the same result for every \class{datetimetz} \var{dt}
933+
in a given year with \code{dt.tzinfo==tz}.  For sane \class{tzinfo}
934+
subclasses, this expression yields the time zone's "standard offset"
935+
within the year, which should be the same across all days in the year.
936+
The implementation of \method{datetimetz.astimezone()} relies on this,
937+
but cannot detect violations; it's the programmer's responsibility to
938+
ensure it.
939+
923940
\end{methoddesc}
924941

925942
These methods are called by a \class{datetimetz} or \class{timetz} object,
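
The consistency requirement added to the docs above can be spot-checked from Python. The following is only a sketch written against the modern `datetime` module (not the `datetimetz` type of this commit), so `dst()` returns a `timedelta` rather than an integer number of minutes. `ToyEastern`, its transition dates, and `standard_offsets` are hypothetical names introduced for illustration. Sampling cannot prove consistency; it can only fail to find a violation.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern(tzinfo):
    """Hypothetical US-Eastern-like zone, valid for 2002 only (illustration)."""

    # Assumed transition points, expressed in local wall time.
    DST_START = datetime(2002, 4, 7, 2)
    DST_END = datetime(2002, 10, 27, 2)

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        return HOUR if self.DST_START <= naive < self.DST_END else timedelta(0)

    def utcoffset(self, dt):
        # The "daylight-aware class" pattern: CONSTANT + self.dst(dt).
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def standard_offsets(tz, year):
    """Collect every value of utcoffset(dt) - dst(dt), sampled hourly over a year."""
    dt = datetime(year, 1, 1, tzinfo=tz)
    end = datetime(year + 1, 1, 1, tzinfo=tz)
    seen = set()
    while dt < end:
        seen.add(tz.utcoffset(dt) - tz.dst(dt))
        dt += HOUR  # aware + timedelta keeps the same tzinfo
    return seen
```

A consistent class yields a set with exactly one element, the zone's standard offset; more than one element means `astimezone()` cannot be trusted with that class.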

Lib/test/test_datetime.py

Lines changed: 25 additions & 0 deletions
@@ -2703,6 +2703,31 @@ def test_easy(self):
27032703
# self.convert_between_tz_and_utc(Eastern, Central) # can't work
27042704
# self.convert_between_tz_and_utc(Central, Eastern) # can't work
27052705

2706+
def test_tricky(self):
2707+
# 22:00 on day before daylight starts.
2708+
fourback = self.dston - timedelta(hours=4)
2709+
ninewest = FixedOffset(-9*60, "-0900", 0)
2710+
fourback = fourback.astimezone(ninewest)
2711+
# 22:00-0900 is 7:00 UTC == 2:00 EST == 3:00 DST. Since it's "after
2712+
# 2", we should get the 3 spelling.
2713+
# If we plug 22:00 the day before into Eastern, it "looks like std
2714+
# time", so its offset is returned as -5, and -5 - -9 = 4. Adding 4
2715+
# to 22:00 lands on 2:00, which makes no sense in local time (the
2716+
# local clock jumps from 1 to 3). The point here is to make sure we
2717+
# get the 3 spelling.
2718+
expected = self.dston.replace(hour=3)
2719+
got = fourback.astimezone(Eastern).astimezone(None)
2720+
self.assertEqual(expected, got)
2721+
2722+
# Similar, but map to 6:00 UTC == 1:00 EST == 2:00 DST. In that
2723+
# case we want the 1:00 spelling.
2724+
sixutc = self.dston.replace(hour=6).astimezone(utc_real)
2725+
# Now 6:00 "looks like daylight", so the offset wrt Eastern is -4,
2726+
# and adding -4-0 == -4 gives the 2:00 spelling. We want the 1:00 EST
2727+
# spelling.
2728+
expected = self.dston.replace(hour=1)
2729+
got = sixutc.astimezone(Eastern).astimezone(None)
2730+
self.assertEqual(expected, got)
27062731

27072732
def test_suite():
27082733
allsuites = [unittest.makeSuite(klass, 'test')
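
The clock arithmetic in the comments of `test_tricky` can be replayed with fixed-offset zones. This sketch uses the modern `datetime.timezone` class, which postdates this commit, and assumes the DST start date of 2002-04-07; it checks only the equivalences stated in the comments, not the `astimezone()` implementation changed here.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the zones named in the test comments.
ninewest = timezone(timedelta(hours=-9), "-0900")
est = timezone(timedelta(hours=-5), "EST")
edt = timezone(timedelta(hours=-4), "EDT")

# 22:00 at UTC-9 on the day before daylight time starts (date assumed).
fourback = datetime(2002, 4, 6, 22, 0, tzinfo=ninewest)

as_utc = fourback.astimezone(timezone.utc)  # 7:00 UTC
as_est = fourback.astimezone(est)           # 2:00 EST, not on the wall clock
as_edt = fourback.astimezone(edt)           # 3:00 daylight time, the wanted spelling

# Second case: 6:00 UTC is 1:00 EST and 2:00 daylight time.
sixutc = datetime(2002, 4, 7, 6, 0, tzinfo=timezone.utc)
six_est = sixutc.astimezone(est)            # 1:00 EST, the wanted spelling
six_edt = sixutc.astimezone(edt)            # 2:00 daylight time, not on the wall clock
```

In both cases the two Eastern spellings name the same instant; the test asserts that the conversion picks the one a local wall clock actually shows.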

Modules/datetimemodule.c

Lines changed: 151 additions & 65 deletions
@@ -4753,8 +4753,9 @@ datetimetz_astimezone(PyDateTime_DateTimeTZ *self, PyObject *args,
47534753

47544754
PyObject *result;
47554755
PyObject *temp;
4756-
int selfoff, resoff, tempoff, total_added_to_result;
4756+
int selfoff, resoff, resdst, total_added_to_result;
47574757
int none;
4758+
int delta;
47584759

47594760
PyObject *tzinfo;
47604761
static char *keywords[] = {"tz", NULL};
@@ -4788,36 +4789,17 @@ datetimetz_astimezone(PyDateTime_DateTimeTZ *self, PyObject *args,
47884789
if (none)
47894790
return result;
47904791

4791-
/* Add resoff-selfoff to result. */
4792-
total_added_to_result = resoff - selfoff;
4793-
mm += total_added_to_result;
4794-
if ((mm < 0 || mm >= 60) &&
4795-
normalize_datetime(&y, &m, &d, &hh, &mm, &ss, &us) < 0)
4796-
goto Fail;
4797-
temp = new_datetimetz(y, m, d, hh, mm, ss, us, tzinfo);
4798-
if (temp == NULL)
4799-
goto Fail;
4800-
Py_DECREF(result);
4801-
result = temp;
4802-
4803-
/* If tz is a fixed-offset class, we're done, but we can't know
4804-
* whether it is. If it's a DST-aware class, and we're not near a
4805-
* DST boundary, we're also done. If we crossed a DST boundary,
4806-
* the offset will be different now, and that's our only clue.
4807-
* Unfortunately, we can be in trouble even if we didn't cross a
4808-
* DST boundary, if we landed on one of the DST "problem hours".
4792+
/* See the long comment block at the end of this file for an
4793+
* explanation of this algorithm. That it always works requires a
4794+
* pretty intricate proof.
48094795
*/
4810-
tempoff = call_utcoffset(tzinfo, result, &none);
4811-
if (tempoff == -1 && PyErr_Occurred())
4796+
resdst = call_dst(tzinfo, result, &none);
4797+
if (resdst == -1 && PyErr_Occurred())
48124798
goto Fail;
4813-
if (none)
4814-
goto Inconsistent;
4815-
4816-
if (tempoff != resoff) {
4817-
/* We did cross a boundary. Try to correct. */
4818-
const int delta = tempoff - resoff;
4819-
total_added_to_result += delta;
4820-
mm += delta;
4799+
/* None and 0 dst() results are the same to us here. Debatable. */
4800+
total_added_to_result = resoff - resdst - selfoff;
4801+
if (total_added_to_result != 0) {
4802+
mm += total_added_to_result;
48214803
if ((mm < 0 || mm >= 60) &&
48224804
normalize_datetime(&y, &m, &d, &hh, &mm, &ss, &us) < 0)
48234805
goto Fail;
@@ -4832,50 +4814,42 @@ datetimetz_astimezone(PyDateTime_DateTimeTZ *self, PyObject *args,
48324814
goto Fail;
48334815
if (none)
48344816
goto Inconsistent;
4835-
}
4836-
/* If this is the first hour of DST, it may be a local time that
4837-
* doesn't make sense on the local clock, in which case the naive
4838-
* hour before it (in standard time) is equivalent and does make
4839-
* sense on the local clock. So force that.
4817+
}
4818+
4819+
/* The distance now from self to result is
4820+
* self - result == naive(self) - selfoff - (naive(result) - resoff) ==
4821+
* naive(self) - selfoff -
4822+
((naive(self) + total_added_to_result) - resoff) ==
4823+
* - selfoff - total_added_to_result + resoff.
48404824
*/
4841-
hh -= 1;
4842-
if (hh < 0 && normalize_datetime(&y, &m, &d, &hh, &mm, &ss, &us) < 0)
4825+
delta = resoff - selfoff - total_added_to_result;
4826+
4827+
/* Now self and result are the same UTC time iff delta is 0.
4828+
* If it is 0, we're done, although that takes some proving.
4829+
*/
4830+
if (delta == 0)
4831+
return result;
4832+
4833+
total_added_to_result += delta;
4834+
mm += delta;
4835+
if ((mm < 0 || mm >= 60) &&
4836+
normalize_datetime(&y, &m, &d, &hh, &mm, &ss, &us) < 0)
48434837
goto Fail;
4838+
48444839
temp = new_datetimetz(y, m, d, hh, mm, ss, us, tzinfo);
48454840
if (temp == NULL)
48464841
goto Fail;
4847-
tempoff = call_utcoffset(tzinfo, temp, &none);
4848-
if (tempoff == -1 && PyErr_Occurred()) {
4849-
Py_DECREF(temp);
4842+
Py_DECREF(result);
4843+
result = temp;
4844+
4845+
resoff = call_utcoffset(tzinfo, result, &none);
4846+
if (resoff == -1 && PyErr_Occurred())
48504847
goto Fail;
4851-
}
4852-
if (none) {
4853-
Py_DECREF(temp);
4848+
if (none)
48544849
goto Inconsistent;
4855-
}
4856-
/* Are temp and result really the same time? temp == result iff
4857-
* temp - tempoff == result - resoff, iff
4858-
* (result - HOUR) - tempoff = result - resoff, iff
4859-
* resoff - tempoff == HOUR
4860-
*/
4861-
if (resoff - tempoff == 60) {
4862-
/* use the local time that makes sense */
4863-
Py_DECREF(result);
4864-
return temp;
4865-
}
4866-
Py_DECREF(temp);
48674850

4868-
/* There's still a problem with the unspellable (in local time)
4869-
* hour after DST ends. If self and result map to the same UTC
4870-
* time, we're OK, else the hour is unrepresentable in the tzinfo
4871-
* zone. The result's local time now is
4872-
* self + total_added_to_result, so self == result iff
4873-
* self - selfoff == result - resoff, iff
4874-
* self - selfoff == (self + total_added_to_result) - resoff, iff
4875-
* - selfoff == total_added_to_result - resoff, iff
4876-
* total_added_to_result == resoff - selfoff
4877-
*/
4878-
if (total_added_to_result == resoff - selfoff)
4851+
if (resoff - selfoff == total_added_to_result)
4852+
/* self and result are the same UTC time */
48794853
return result;
48804854

48814855
/* Else there's no way to spell self in zone tzinfo. */
@@ -5498,3 +5472,115 @@ initdatetime(void)
54985472
if (us_per_hour == NULL || us_per_day == NULL || us_per_week == NULL)
54995473
return;
55005474
}
5475+
5476+
/* ---------------------------------------------------------------------------
5477+
Some time zone algebra. For a datetimetz x, let
5478+
x.n = x stripped of its timezone -- its naive time.
5479+
x.o = x.utcoffset(), and assuming that doesn't raise an exception or
5480+
return None
5481+
x.d = x.dst(), and assuming that doesn't raise an exception or
5482+
return None
5483+
x.s = x's standard offset, x.o - x.d
5484+
5485+
Now some derived rules, where k is a duration (timedelta).
5486+
5487+
1. x.o = x.s + x.d
5488+
This follows from the definition of x.s.
5489+
5490+
2. If x and y have the same tzinfo member, x.s == y.s.
5491+
This is actually a requirement, an assumption we need to make about
5492+
sane tzinfo classes.
5493+
5494+
3. The naive UTC time corresponding to x is x.n - x.o.
5495+
This is again a requirement for a sane tzinfo class.
5496+
5497+
4. (x+k).s = x.s
5498+
This follows from #2, and that datetimetz+timedelta preserves tzinfo.
5499+
5500+
5. (y+k).n = y.n + k
5501+
Again follows from how arithmetic is defined.
5502+
5503+
Now we can explain x.astimezone(tz). Let's assume it's an interesting case
5504+
(meaning that the various tzinfo methods exist, and don't blow up or return
5505+
None when called).
5506+
5507+
The function wants to return a datetimetz y with timezone tz, equivalent to x.
5508+
5509+
By #3, we want
5510+
5511+
y.n - y.o = x.n - x.o [1]
5512+
5513+
The algorithm starts by attaching tz to x.n, and calling that y. So
5514+
x.n = y.n at the start. Then it wants to add a duration k to y, so that [1]
5515+
becomes true; in effect, we want to solve [2] for k:
5516+
5517+
(y+k).n - (y+k).o = x.n - x.o [2]
5518+
5519+
By #1, this is the same as
5520+
5521+
(y+k).n - ((y+k).s + (y+k).d) = x.n - x.o [3]
5522+
5523+
By #5, (y+k).n = y.n + k, which equals x.n + k because x.n=y.n at the start.
5524+
Substituting that into [3],
5525+
5526+
x.n + k - (y+k).s - (y+k).d = x.n - x.o; the x.n terms cancel, leaving
5527+
k - (y+k).s - (y+k).d = - x.o; rearranging,
5528+
k = (y+k).s - x.o - (y+k).d; by #4, (y+k).s == y.s, so
5529+
k = y.s - x.o - (y+k).d; then by #1, y.s = y.o - y.d, so
5530+
k = y.o - y.d - x.o - (y+k).d
5531+
5532+
On the RHS, (y+k).d can't be computed directly, but all the rest can be, and
5533+
we approximate k by ignoring the (y+k).d term at first. Note that k can't
5534+
be very large, since all offset-returning methods return a duration of
5535+
magnitude less than 24 hours. For that reason, if y is firmly in std time,
5536+
(y+k).d must be 0, so ignoring it has no consequence then.
5537+
5538+
In any case, the new value is
5539+
5540+
z = y + y.o - y.d - x.o
5541+
5542+
If
5543+
z.n - z.o = x.n - x.o [4]
5544+
5545+
then we have an equivalent time, and are almost done. The insecurity here is
5546+
at the start of daylight time. Picture US Eastern for concreteness. The wall
5547+
time jumps from 1:59 to 3:00, and wall hours of the form 2:MM don't make good
5548+
sense then. A sensible Eastern tzinfo class will consider such a time to be
5549+
EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST on the
5550+
day DST starts. We want to return the 1:MM EST spelling because that's
5551+
the only spelling that makes sense on the local wall clock.
5552+
5553+
Claim: When [4] is true, we have "the right" spelling in this endcase. No
5554+
further adjustment is necessary.
5555+
5556+
Proof: The right spelling has z.d = 0, and the wrong spelling has z.d != 0
5557+
(for US Eastern, the wrong spelling has z.d = 60 minutes, but we can't assume
5558+
that all time zones work this way -- we can assume a time zone is in daylight
5559+
time iff dst() doesn't return 0). By [4], and recalling that z.o = z.s + z.d,
5560+
5561+
z.n - z.s - z.d = x.n - x.o [5]
5562+
5563+
Also
5564+
5565+
z.n = (y + y.o - y.d - x.o).n by the construction of z, which equals
5566+
y.n + y.o - y.d - x.o by #5.
5567+
5568+
Plugging that into [5],
5569+
5570+
y.n + y.o - y.d - x.o - z.s - z.d = x.n - x.o; cancelling the x.o terms,
5571+
y.n + y.o - y.d - z.s - z.d = x.n; but x.n = y.n too, so they also cancel,
5572+
y.o - y.d - z.s - z.d = 0; then y.o = y.s + y.d, so
5573+
y.s + y.d - y.d - z.s - z.d = 0; then the y.d terms cancel,
5574+
y.s - z.s - z.d = 0; but y and z are in the same timezone, so by #2
5575+
y.s = z.s, and they also cancel, leaving
5576+
- z.d = 0; or,
5577+
z.d = 0
5578+
5579+
Therefore z is the standard-time spelling, and there's nothing left to do in
5580+
this case.
5581+
5582+
Note that we actually proved something stronger: when [4] is true, it must
5583+
also be true that z.dst() returns 0.
5584+
5585+
XXX Flesh out the rest of the algorithm.
5586+
--------------------------------------------------------------------------- */
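
The first step of the algebra above, z = y + y.o - y.d - x.o followed by the check of equation [4], can be sketched in Python. This is an illustration under assumptions, not the C implementation: it uses the modern `datetime` module, and `ToyEastern`, its 2002 transition dates, and `first_guess` are hypothetical. It stops where the comment block stops, so the follow-up adjustment applied when [4] fails is left out.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


class ToyEastern(tzinfo):
    """Hypothetical US-Eastern-like zone, valid for 2002 only (illustration)."""

    DST_START = datetime(2002, 4, 7, 2)    # assumed, local wall time
    DST_END = datetime(2002, 10, 27, 2)    # assumed, local wall time

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        return HOUR if self.DST_START <= naive < self.DST_END else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def first_guess(x_naive, x_offset, tz):
    """Return (z, ok) where z = y + y.o - y.d - x.o and ok says whether [4] holds.

    x_naive is x.n, x_offset is x.o, and tz is the target tzinfo.
    """
    y = x_naive.replace(tzinfo=tz)               # attach tz to x.n, so y.n == x.n
    k = tz.utcoffset(y) - tz.dst(y) - x_offset   # y.o - y.d - x.o
    z = y + k                                    # rule 5: (y+k).n == y.n + k
    z_naive = z.replace(tzinfo=None)
    ok = (z_naive - tz.utcoffset(z)) == (x_naive - x_offset)  # equation [4]
    return z, ok


# 6:00 UTC on the assumed DST start day: [4] holds and z is the 1:00 EST spelling.
z1, ok1 = first_guess(datetime(2002, 4, 7, 6), ZERO, ToyEastern())

# 22:00 at UTC-9 the evening before: the guess lands on 2:MM and [4] fails,
# so the later adjustment in the C code is still needed.
z2, ok2 = first_guess(datetime(2002, 4, 6, 22), timedelta(hours=-9), ToyEastern())
```

Consistent with the proof, the guess that satisfies [4] has `z.dst()` equal to zero, and the guess that fails [4] has a non-zero `z.dst()`.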
