scalar categoricals are sometimes interpreted as data keys #9844

anntzer · 2017-11-24T00:29:13Z

Bug report

Bug summary

Code for reproduction

from matplotlib import pyplot as plt
fig, axs = plt.subplots(2)
axs[0].bar("thing", 1, data={"thing": 1})
axs[1].bar("other", 1, data={"thing": 1})
plt.show()

Actual outcome

The first plot's x-value ("thing") is interpreted as a lookup into the data dict and replaced by 1.
The second plot's x-value does not appear in the data dict and is interpreted as a categorical.

Expected outcome

Not sure what the best option is, but silently changing the interpretation of input based on whether it is present or not in another dict seems a bit finicky. I have proposed in other places to not allow scalar categoricals at all (they would always need to be passed in a container: list, array, dataframe, etc.), which would also solve the inconsistency that plot(1, "x") currently specifies a marker whereas plot("x", 1) treats "x" as a categorical.

Unlike other categorical issues I don't actually think this is release critical per se, but it would still be nice to get the behavior clarified/simplified...

Matplotlib version

Operating system:
Matplotlib version: 2.1
Matplotlib backend (print(matplotlib.get_backend())):
Python version:
Jupyter version (if applicable):
Other libraries:


story645 · 2017-11-28T17:26:05Z

This behavior happens with any scalar, so I'm wondering whether the solution is something like disallowing scalars that aren't keys in data whenever data is present.

from matplotlib import pyplot as plt
fig, axs = plt.subplots(2)
axs[0].bar(10, 1, data={10: 1})
axs[1].bar(20, 1, data={10: 1})
plt.show()

story645 · 2018-02-12T02:19:26Z

And looking at the example I put down, it's even more buggy because it's not registering the scalar as the key at all (floating point error?).

github-actions · 2023-04-25T01:54:16Z

This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help!

ksunden · 2023-04-25T05:28:57Z

I'm not sure there is much that can be done here...

There are two rules that come into play with handling strings as data:

1. The data kwarg handling, which uses the string to access the value of a mapping
2. The categorical handling

Necessarily, they must be resolved in that order. If they were resolved in the other order, categorical handling would always take place, and thus the data kwarg would be useless.
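
The ordering argument above can be illustrated with a toy model in plain Python. This is only a sketch, not Matplotlib's actual code: the function name classify and the returned labels are hypothetical, and restricting the lookup to strings is an assumption taken from the docstring wording quoted in this comment.

```python
def classify(arg, data=None):
    """Toy model of the two rules (hypothetical, not Matplotlib's code)."""
    # Rule 1: data kwarg handling. Only strings are candidates for lookup.
    if data is not None and isinstance(arg, str):
        try:
            return ("data lookup", data[arg])
        except Exception:
            pass  # lookup failed: fall through to the next rule

    # Rule 2: categorical handling for anything that is still a string.
    if isinstance(arg, str):
        return ("categorical", arg)

    return ("plain value", arg)


data = {"thing": 1}
print(classify("thing", data))  # ('data lookup', 1)
print(classify("other", data))  # ('categorical', 'other')
```

If the two checks were swapped, every string would be claimed by the categorical branch and the data lookup could never be reached, which is the point made above.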

As for non-strings as keys of the data kwarg, I think that is unsupported undefined behavior:

The docstring entry for data kwarg reads:

data : indexable object, optional

If given, all parameters also accept a string s, which is interpreted as data[s] (unless this raises an exception).

As of (at least) #10928, it is strictly enforced that keys must be strings for any data kwarg behavior to apply.

I suppose the "not allowed to pass categoricals as scalars" is potentially a solution (as that would mean that strings are not expected for any parameter except via the data kwarg)... but that is a rather large API change for a rather narrow set of interactions, though perhaps it would have other benefits (and I'm not quite sure how easy it would be to change that, actually).

If we are unwilling to deprecate using bare strings for either of those cases, though, I don't see any other behavior change we could make. Perhaps some docs changes, but even then I'm not sure where.

github-actions · 2024-06-14T01:50:39Z

This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help!

timhoffm · 2024-06-14T12:44:31Z

IMHO this is a lost cause. There are cases outside of categoricals in which data-replaceable args can rightfully be strings. Consider the following realistic example:

from matplotlib import pyplot as plt
fig, axs = plt.subplots(2)
data = {
    "x": [0, 1],
    "y": [1, 1], 
    "color": ["c", "m"],
}
axs[0].scatter("x", "y", facecolor="color", data=data)
axs[1].scatter("x", "y", facecolor="red", data=data)
plt.show()

We cannot reasonably prohibit the second case. As a consequence, it would be very tedious to tell acceptable string scalars apart from illegal ones.

I propose to close this. And while "silently changing the interpretation of input based on whether it is present or not in another dict is a bit finicky" is not optimal, the behavior is consistent: "If a string, try to look it up in data, otherwise proceed as normal". A mild improvement could be updating the data description from

If given, the following parameters also accept a string s, which is interpreted as data[s] (unless this raises an exception):

to

If given, the following parameters also accept a string s, which is interpreted as data[s] if data[s] exists.

anntzer · 2024-06-15T12:03:13Z

Indeed, this is a good example of how difficult this is to fix.
I'm not sure about the proposed wording change; what does "data[s] exists" actually mean? (other than "the expression does not raise an exception"...)

timhoffm · 2024-06-15T13:37:28Z

You're right, it's not much clearer. I find the parentheses in the original message a bit confusing, because (1) why put this in parentheses, and (2) what happens if that raises? What we need to communicate is

If the parameter is a string s, we try to look up data[s]. If that works, the resulting value is used for the parameter. If it fails, s itself is used (which may or may not be a valid input type for the parameter).
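
As a minimal sketch of the rule just described, applied to the arguments of the scatter example earlier in this thread: the helper name resolve is hypothetical, and this is plain Python written for illustration, not Matplotlib's implementation.

```python
def resolve(arg, data):
    """Hypothetical helper: use data[arg] if arg is a string and the lookup works."""
    if isinstance(arg, str):
        try:
            return data[arg]
        except Exception:
            # Lookup failed: s itself is used, which may or may not be
            # a valid input for the parameter in question.
            return arg
    return arg


data = {"x": [0, 1], "y": [1, 1], "color": ["c", "m"]}

# First call in the example: every string is found in data.
first = {name: resolve(value, data)
         for name, value in {"x": "x", "y": "y", "facecolor": "color"}.items()}

# Second call: "red" is not in data, so it is passed through unchanged.
second = {name: resolve(value, data)
          for name, value in {"x": "x", "y": "y", "facecolor": "red"}.items()}

print(first["facecolor"])   # ['c', 'm']
print(second["facecolor"])  # red
```

In this model nothing distinguishes a legitimate color name from a mistyped key; both simply fall through unchanged, which is why the documentation wording matters.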

Can't think of a nice wording for this right now. Help welcome.

jklymak · 2024-06-15T14:23:09Z

Is it not just "if s is a key of data"?

timhoffm · 2024-06-15T15:11:04Z

I'm not quite sure that key is universally understood. For example I don't think columns of a DataFrame or fields of a structured numpy array are commonly referred to as keys. I believe explicitly using data[s] is helpful.

jklymak · 2024-06-15T16:55:14Z

Certainly pandas and xarray both refer to keys, and that is the terminology for dictionaries. If people don't know that term when using data structures, they should probably learn it. However, I agree it is also helpful to have data[s]. My phrasing above was not meant to replace the whole sentence, but it is less awkward than "data[s] raises an exception".


anntzer added the topic: categorical label Nov 24, 2017

jklymak added this to the v2.1.1 milestone Nov 24, 2017

anntzer removed the topic: categorical label Nov 28, 2017

dstansby added the topic: categorical label Nov 30, 2017

tacaswell modified the milestones: v2.1.1, v2.2 Dec 6, 2017

story645 added the status: confirmed bug label Jul 24, 2019

github-actions bot added the status: inactive label Apr 25, 2023

github-actions bot removed the status: inactive label Apr 26, 2023

github-actions bot added the status: inactive label Jun 14, 2024

anntzer added the keep label and removed the status: inactive label Jun 14, 2024

timhoffm added a commit to timhoffm/matplotlib that referenced this issue Jun 15, 2024

DOC: Improve doc wording of data parameter

128ab72

See matplotlib#9844 (comment)

timhoffm added a commit to timhoffm/matplotlib that referenced this issue Jun 15, 2024

DOC: Improve doc wording of data parameter

76426a0

See matplotlib#9844 (comment)

timhoffm mentioned this issue Jun 15, 2024

DOC: Improve doc wording of data parameter #28400

Merged

trygvrad pushed a commit to trygvrad/matplotlib that referenced this issue Jun 25, 2024

DOC: Improve doc wording of data parameter

58b57fc

See matplotlib#9844 (comment)
