comparing time series with index of different units #63466
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -245,8 +245,27 @@ def equals(self, other: Any) -> bool: | |
| if not isinstance(other, Index): | ||
| return False | ||
| elif other.dtype.kind in "iufc": | ||
| if hasattr(other, "dtype") and other.dtype.kind in "iufc": | ||
| return False | ||
| if len(self) != len(other): | ||
| return False | ||
| self_unit = getattr(self, "unit", None) | ||
| other_unit = getattr(other, "unit", None) | ||
| if self_unit is not None and other_unit is not None and self_unit != other_unit: | ||
| if getattr(self.dtype, "tz", None) == getattr(other.dtype, "tz", None): | ||
| try: | ||
| other_values = other._values | ||
| if hasattr(other_values, "as_unit") and hasattr( | ||
| self._values, "equals" | ||
| ): | ||
| return self._values.equals(other_values.as_unit(self_unit)) | ||
Comment on lines +260 to +265
Member
In what cases does this raise?
Contributor
Author
For example, if we pass two date_ranges to equals(), one with frequency "D" and the other with "M", and this branch is hit, we first check whether other_values has a unit that can be converted, then check whether self's values can be compared against it, and finally return whether the two are equal.
Member
That makes sense, but why is this code itself in a try-except? That is why I am asking in what case this code itself raises.
Contributor
Author
During my testing I saw some scenarios where as_unit(self_unit) raised. I am new to contributing to pandas, so I didn't know try-excepts are discouraged here. Let me know if I should remove it and I will commit that to this PR. This is a solution I came up with; it passes the tests locally on my system. Should I push this instead?

    if self_unit is not None and other_unit is not None and self_unit != other_unit:
        if getattr(self.dtype, "tz", None) == getattr(other.dtype, "tz", None):
            other_values = other._values
            if hasattr(other_values, "as_unit") and hasattr(self._values, "equals"):
                return self._values.equals(other_values.as_unit(self_unit))

| except (ValueError, TypeError, AttributeError): | ||
| return False | ||
| elif not isinstance(other, type(self)): | ||
| should_try = False | ||
| inferable = self._data._infer_matches | ||
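
The idea behind the unit-aware branch discussed in the review thread above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyIndex` and `NS_PER_TICK` are hypothetical names, not pandas internals, and the sketch converts both sides to nanoseconds instead of calling `as_unit(self_unit)` as the PR does. It shows why two indexes that store the same instants in different units should compare equal, and why raw integer comparison would not work.

```python
from dataclasses import dataclass

# Nanoseconds per tick for each unit (toy model, not pandas internals).
NS_PER_TICK = {"s": 1_000_000_000, "ms": 1_000_000, "us": 1_000, "ns": 1}


@dataclass(frozen=True)
class ToyIndex:
    """Hypothetical stand-in for a datetime index: integer ticks plus a unit."""

    ticks: tuple
    unit: str

    def to_ns(self):
        # Convert to nanoseconds, the finest unit here, so no precision is lost.
        factor = NS_PER_TICK[self.unit]
        return tuple(t * factor for t in self.ticks)

    def equals(self, other):
        # Anything that is not a ToyIndex, or has a different length, is unequal.
        if not isinstance(other, ToyIndex) or len(self.ticks) != len(other.ticks):
            return False
        # Compare the instants, not the raw stored integers.
        return self.to_ns() == other.to_ns()


# 2000-01-01 and 2000-01-02 (UTC) as epoch seconds and epoch milliseconds.
in_seconds = ToyIndex((946_684_800, 946_771_200), "s")
in_millis = ToyIndex((946_684_800_000, 946_771_200_000), "ms")
off_by_one_ms = ToyIndex((946_684_800_001, 946_771_200_000), "ms")

print(in_seconds.equals(in_millis))      # True
print(in_seconds.equals(off_by_one_ms))  # False
```

Converting to the finest unit avoids the question of what to do when a value cannot be represented exactly in a coarser unit, which is one plausible source of the exceptions mentioned in the thread; whether that is what the author actually observed is not stated.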
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -246,7 +246,7 @@ def test_intersection_same_timezone_different_units(self): | |
| # Test intersection | ||
| result = idx1.intersection(idx2) | ||
| expected = date_range("2000-01-01", periods=3, tz=tz).as_unit("ns") | ||
| - tm.assert_index_equal(result, expected) | ||
| + tm.assert_index_equal(result, expected, exact=False) | ||
Member
Why does this need to change?
Contributor
Author
It relaxes the dtype strictness. For datetimes, it stops caring whether the storage is in nanoseconds or microseconds, as long as the dates themselves represent the same point in time.
Member
Prior to this PR the
Contributor
Author
The test used to pass because the code was implicitly forcing everything into one format (nanoseconds). Now that the code correctly preserves the original units, the test fails because it expects a specific format that no longer matches the output. I changed the test to
| def test_symmetric_difference_same_timezone_different_units(self): | ||
| # GH 60080 - fix timezone being changed to UTC when units differ | ||
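
The trade-off raised in the `exact=False` thread above can be sketched with a toy assertion helper. This is an illustration only: `assert_toy_index_equal` and its `exact` flag are hypothetical and merely mimic the idea of a strict versus relaxed check; they do not reproduce the semantics of `tm.assert_index_equal`.

```python
# Nanoseconds per tick for each unit (toy model, not pandas internals).
NS_PER_TICK = {"s": 1_000_000_000, "ms": 1_000_000, "us": 1_000, "ns": 1}


def to_ns(ticks, unit):
    # Normalize stored integers to nanoseconds so instants can be compared.
    return [t * NS_PER_TICK[unit] for t in ticks]


def assert_toy_index_equal(result, expected, exact=True):
    """Hypothetical helper; result and expected are (ticks, unit) pairs."""
    r_ticks, r_unit = result
    e_ticks, e_unit = expected
    # Strict mode also requires the storage unit to match.
    if exact and r_unit != e_unit:
        raise AssertionError(f"units differ: {r_unit!r} != {e_unit!r}")
    # Both modes require the instants themselves to match.
    if to_ns(r_ticks, r_unit) != to_ns(e_ticks, e_unit):
        raise AssertionError("values differ")


# The same instant (2000-01-01 UTC) stored in microseconds and in nanoseconds.
result = ([946_684_800_000_000], "us")
expected = ([946_684_800_000_000_000], "ns")

# Relaxed check: passes because only the instants are compared.
assert_toy_index_equal(result, expected, exact=False)

# Strict check: fails because the units differ.
strict_error = None
try:
    assert_toy_index_equal(result, expected)
except AssertionError as err:
    strict_error = str(err)
print(strict_error)  # units differ: 'us' != 'ns'
```

The relaxed form makes the test pass, but it also stops the test from pinning down which unit the operation returns, which appears to be the reviewer's concern.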