scalar categoricals are sometimes interpreted as data keys #9844
Comments
This behavior happens with any scalar, so I'm wondering if the solution isn't something like locking out scalars that aren't keys in data whenever data is present.
from matplotlib import pyplot as plt
fig, axs = plt.subplots(2)
axs[0].bar(10, 1, data={10: 1})
axs[1].bar(20, 1, data={10: 1})
plt.show()
And looking at the example I put down, it's even more buggy 'cause it's not registering the scalar as the key at all (floating point error?).
This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help! |
I'm not sure there is much that can be done here... There are two rules that come into play with handling strings as data:
Necessarily, they must be resolved in that order. If they were resolved in the other order, categorical handling would always take place, and thus the data kwarg would be useless. As for non-strings as keys of the data kwarg, I think that is unsupported undefined behavior: The docstring entry for
As of (at least) #10928, it is strictly enforced that keys must be strings for any data kwarg behavior to apply. I suppose that "not allowed to pass categoricals as scalars" is potentially a solution (as that would mean that strings are not expected for any parameter except via the data kwarg)... but that is a rather large API change for a rather narrow set of interactions, though perhaps it would have other benefits (and I'm not quite sure how easy it would be to change that, actually). If we are unwilling to deprecate using bare strings for either of those cases, though, I don't see any other behavior change to make; perhaps some docs changes, but even then I'm not sure where.
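To make the resolution order concrete, here is a simplified model in plain Python. It is only a sketch, not Matplotlib's implementation: `resolve_arg` is a hypothetical helper, and it assumes that only strings are ever looked up in `data` and that a failed lookup falls back to the literal value.

```python
def resolve_arg(value, data=None):
    """Hypothetical helper modelling 'look strings up in data first'."""
    if data is not None and isinstance(value, str):
        try:
            # Rule 1: the string names an entry in data, so use that entry.
            return data[value]
        except Exception:
            # Lookup failed: fall through and keep the string itself.
            pass
    # Rule 2: the value is used as given (a string would then be
    # handled as a categorical by the plotting code).
    return value


data = {"thing": 1, 10: 1}
print(resolve_arg("thing", data))  # looked up in data
print(resolve_arg("other", data))  # not in data, stays a string
print(resolve_arg(10, data))       # non-string, never looked up
```

Swapping the two rules would mean every string is taken literally before `data` is ever consulted, which is why the lookup has to come first.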
IMHO this is a lost cause. There are cases outside of categoricals, in which data-replaceable args can rightfully be strings. Consider the following realistic example
We cannot reasonably prohibit the second case. As a consequence, it would be very tedious to tell acceptable string scalars apart from illegal ones. I propose to close this. And while "silently changing the interpretation of input based on whether it is present or not in another dict is a bit finicky" is not optimal, the behavior is consistent: "If a string, try to look it up in
to
Indeed, this is a good example of how difficult this is to fix.
You're right, it's not much clearer. I find the parentheses in the original message a bit confusing, because (1) why put it in parentheses, and (2) what happens if that raises? What we need to communicate is
Can't think of a nice wording for this right now. Help welcome.
Is it not just "if s is a key of data"?
I'm not quite sure that key is universally understood. For example I don't think columns of a DataFrame or fields of a structured numpy array are commonly referred to as keys. I believe explicitly using |
Certainly pandas and xarray both refer to keys, and that is the terminology for dictionaries. If people don't know that term when using data structures they should probably learn. However, I agree it is also helpful to have data[s]. My phrasing above was not meant to replace the whole sentence, but is less awkward than "data[s] raises an exception".
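The two wordings are not always equivalent, which a small sketch can illustrate. `Record` below is a hypothetical container invented for this example (it is not a pandas, xarray, or numpy type): it supports `data[s]` by field name, but its `in` operator tests values rather than field names.

```python
class Record:
    """Hypothetical container: indexable by field name, but not a dict."""

    def __init__(self, **fields):
        self._fields = fields

    def __getitem__(self, name):
        # data[s] works for any field name.
        return self._fields[name]

    def __iter__(self):
        # Iteration yields the stored values, so `s in record`
        # compares s against values, not against field names.
        return iter(self._fields.values())


rec = Record(x=[1, 2, 3])
print("x" in rec)  # membership test on values
print(rec["x"])    # lookup by field name

plain = {"x": [1, 2, 3]}
print("x" in plain)  # for a dict, both views agree
```

For a plain dict the two phrasings coincide; for containers like this one, only "data[s] succeeds" describes what the lookup actually depends on.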
Bug report
Bug summary
Code for reproduction
Actual outcome
The first plot's x-value ("thing") is interpreted as a lookup into the data dict and replaced by 1.
The second plot's x-value does not appear in the data dict and is interpreted as a categorical.
Expected outcome
Not sure what the best option is, but silently changing the interpretation of input based on whether it is present or not in another dict seems a bit finicky. I have proposed in other places to not allow scalar categoricals at all (they would always need to be passed in a container: list, array, dataframe, etc.), which also solves the inconsistency that plot(1, "x") currently specs a marker whereas plot("x", 1) treats "x" as a categorical.
Unlike other categorical issues I don't actually think this is release critical per se, but it would still be nice to get the behavior clarified/simplified...
Matplotlib version