-
-
Notifications
You must be signed in to change notification settings - Fork 142
GH1173 Experiment with fuller typing #1193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
You'd still need to have the |
I was able to make some progress here, let me know if this is what you envisioned and I will clean it up. |
Hey - yeah, this is what I was thinking - I don't really understand why
it looks like Irv disagrees and that what I've suggested may be against the pandas-stubs philosophy - which is fair enough |
So I looked more carefully at the pandas docs, which do say that the accepted So I'll look more carefully at this PR now with that in mind. |
@overload | ||
def query( | ||
self, | ||
expr: _str, | ||
*, | ||
inplace: Literal[True], | ||
**kwargs: Any, | ||
) -> None: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear why you need this overload when the previous overload covers it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test would raise a mypy/pyright error without the other overload.
pandas-stubs/tests/test_frame.py
Lines 532 to 536 in d19ac89
kwargs = {"parser": "pandas", "engine": "numexpr"} | |
check( | |
assert_type(df.query("col1 > col2", inplace=False, **kwargs), pd.DataFrame), | |
pd.DataFrame, | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My take on this is that we shouldn't support the kwargs
being passed in a dict
, because they can't be type-checked. The docs say what kwargs
are acceptable, via the eval()
link, so let's not include this test (and the overloads supporting kwargs
def query( | ||
self, | ||
expr: _str, | ||
*, | ||
inplace: Literal[False] = ..., | ||
**kwargs: Any, | ||
) -> Self: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment - not sure why we need this overload when the previous one covers it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal was to still allow for someone passing kwargs as a dictionary instead of the individual arguments since it is the way documented in the docs like df.query("col1 > col2", **kwargs)
that you would pass from another function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that someone may put a wrapper like:
def my_own_query(df, expr, **kwargs):
return df.query(expr, **kwargs)
If you drop the second overload this will raise an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the goal here is to restrict the possible keyword arguments, so we should remove the overload.
tests/test_frame.py
Outdated
check( | ||
assert_type( | ||
df.query("col1 > col2", parser="pandas", engine="numexpr"), pd.DataFrame | ||
), | ||
pd.DataFrame, | ||
) | ||
check( | ||
assert_type( | ||
df.query("col1 > col2", parser="pandas", engine="numexpr"), pd.DataFrame | ||
), | ||
pd.DataFrame, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
I will raise an issue on the pandas side if it is not simpler to have all the arguments directly in the function. It is a good point since it should not be too hard to maintain. |
It's a docs issue, so raise an issue there that it would be more useful to list the arguments under |
Opened the issue on pandas repo, will mark this PR as blocked for now until the pandas team can clarify orientation on this. |
I think we should remove the |
kwargs
inDataFrame.query
according toDataFrame.eval
#1173assert_type()
to assert the type of any return value@MarcoGorelli I took a try at your idea of typehinting the whole set of arguments in
pd.DataFrame.query
but stumbling upon the issue when the user passes a dictionary (which is still an allowed behavior).Wondering if there is something I am missing here, please let me know!