Preparations for multivariate plotting #29877
This commit introduces the MultiNorm class to prepare for the introduction of multivariate plotting methods.
    return x
else:
    # in case of a dtype with multiple fields:
    try:
Would be good to get at least partial coverage for this branch.
I haven't really been involved in this work nor understand how it works, but there is quite a bit of introduced code to deal with multiple datatypes? If this will be covered by tests/functionality in later PRs, that is fine, if not, please add tests for (most of) it.
lib/matplotlib/colorizer.py
if self.norm.n_output != cmap_obj.n_variates:
    raise ValueError(f"The colormap {cmap} does not support "
                     f"{self.norm.n_output} variates as required by "
                     f"the {type(self.norm)} on this Colorizer.")
Error messages typically have no end dot (same comment applies throughout).
Thanks, I'll need to change this in the other PR as well.
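To illustrate the convention raised in this comment, here is a minimal sketch of the same check with the trailing dot removed from the message. `StubNorm`, `StubColormap`, and `check_variates` are hypothetical stand-ins invented for this example; they are not part of matplotlib, and only the attribute names `n_output` and `n_variates` are taken from the diff above.

```python
class StubNorm:
    """Hypothetical stand-in for a norm that outputs two variates."""
    n_output = 2


class StubColormap:
    """Hypothetical stand-in for a colormap that accepts one variate."""
    n_variates = 1


def check_variates(norm, cmap_obj, cmap_name):
    """Raise if the colormap cannot take as many variates as the norm emits."""
    if norm.n_output != cmap_obj.n_variates:
        # No trailing dot at the end of the message, per the review comment.
        raise ValueError(
            f"The colormap {cmap_name} does not support "
            f"{norm.n_output} variates as required by "
            f"the {type(norm).__name__} on this Colorizer")


try:
    check_variates(StubNorm(), StubColormap(), "viridis")
    message = None
except ValueError as err:
    message = str(err)
```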
mask = np.empty(x.shape, dtype=np.dtype('bool, '*len(x.dtype.descr)))
for dd, dm in zip(x.dtype.descr, mask.dtype.descr):
    mask[dm[0]] = ~(np.isfinite(x[dd[0]]))
xm = np.ma.array(x, mask=mask, copy=False)
Do numpy masked arrays actually support struct arrays as mask, with possibly different masking of the fields?
I have found that this is the only way numpy supports masking dtypes with multiple fields, but I will see if [("mask", bool, len(x.dtype.descr))], as you suggest below, is a reasonable approach to using a single mask.
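For readers following the question, a small self-contained sketch of per-field masking on a structured array may help. It is not the PR's code: the field names `a` and `b` and the sample values are made up, and it builds the mask dtype with `np.ma.make_mask_descr` rather than the string-based dtype used in the diff.

```python
import numpy as np

# Structured array with two float fields (names chosen for illustration).
x = np.zeros(3, dtype=[("a", float), ("b", float)])
x["a"] = [1.0, np.nan, 3.0]
x["b"] = [np.inf, 5.0, 6.0]

# Boolean mask with the same field layout as x: one bool per field per element.
mask = np.zeros(x.shape, dtype=np.ma.make_mask_descr(x.dtype))
for name in x.dtype.names:
    mask[name] = ~np.isfinite(x[name])

xm = np.ma.array(x, mask=mask)

# Each field carries its own, independent mask.
mask_a = xm["a"].mask.tolist()
mask_b = xm["b"].mask.tolist()
```

In this sketch the second element is masked only in field `a` and the first element only in field `b`, which is the "different masking of the fields" case asked about.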
else:
    # in case of a dtype with multiple fields:
    try:
        mask = np.empty(x.shape, dtype=np.dtype('bool, '*len(x.dtype.descr)))
Could the dtype be e.g. [("mask", bool, len(x.dtype.descr))]
(with a slightly different API)?
This is an interesting idea. I'll make a prototype and see if this would add unnecessary complexity somewhere else.
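A rough sketch of what the suggested dtype looks like in practice, assuming the mask lives in its own array with a single sub-array field. The sample data and field names `a` and `b` are invented, and the sub-array shape is written as a tuple `(n_fields,)` rather than the bare integer in the suggestion.

```python
import numpy as np

# Structured data with two float fields (names chosen for illustration).
x = np.zeros(3, dtype=[("a", float), ("b", float)])
x["a"] = [1.0, np.nan, 3.0]
x["b"] = [np.inf, 5.0, 6.0]

n_fields = len(x.dtype.names)

# One field called "mask" holding a boolean sub-array of length n_fields.
mask = np.zeros(x.shape, dtype=[("mask", bool, (n_fields,))])

# Field access returns a view with a trailing axis of length n_fields,
# so column i can be filled from the i-th field of x.
for i, name in enumerate(x.dtype.names):
    mask["mask"][..., i] = ~np.isfinite(x[name])

mask_rows = mask["mask"].tolist()
```

With this layout the per-field flags sit in an ordinary trailing axis, so they are addressed by position rather than by field name.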
@anntzer I think this is important, so I wanted to reply to this in the main thread.
The context here is that multivariate data is stored internally as an array with a data type with multiple fields. It should be noted that when a regular np.array is masked, and the mask is
I didn't actually get as far as to prototype this, but I did have a look around. I have found that it will largely involve changes to
I have tried to list the advantages/disadvantages of the two approaches below:
A: Use a masked array with a struct array.
Advantages:
Disadvantages:
B: store the mask as an additional dtype in the struct array i.e.
Advantages:
Disadvantages:
Having looked at this, my personal opinion is that option A is more suitable for matplotlib because I think it will be easier to maintain. @anntzer let me know if I have interpreted your suggestion correctly, and if you agree with my assessment of approach A or B, or if you think I should make a full prototype to explore this further.
Thank you @QuLogic
Co-authored-by: Elliott Sales de Andrade <[email protected]>
PR summary
This PR continues the work of #28658, #28454, and #29876, aiming to close #14168 (Feature request: Bivariate colormapping).
This is part two of the former PR, #29221, and builds upon #29876. Please see #29221 for the previous discussion.
#29876 includes:
MultiNorm class. This is a subclass of colors.Normalize and holds n_variate norms.
MultiNorm class
This PR includes:
Features not included in this PR:
MultiNorm together with BivarColormap and MultivarColormap to the plotting functions axes.imshow(...), axes.pcolor, and axes.pcolormesh(...)
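As a rough illustration of the idea in the summary that one norm object holds several per-variate norms, here is a toy sketch. `ToyMultiNorm` is a hypothetical class written for this example only; it does not subclass `colors.Normalize` and is not matplotlib's `MultiNorm`, whose actual interface is defined in #29876.

```python
import numpy as np


class ToyMultiNorm:
    """Hypothetical stand-in: one linear (vmin, vmax) norm per variate."""

    def __init__(self, limits):
        # limits is a sequence of (vmin, vmax) pairs, one per variate.
        self.limits = list(limits)

    @property
    def n_variates(self):
        return len(self.limits)

    def __call__(self, values):
        # values is a sequence with one array-like per variate.
        if len(values) != self.n_variates:
            raise ValueError(
                f"Expected {self.n_variates} variates, got {len(values)}")
        # Scale each variate linearly so that vmin maps to 0 and vmax to 1.
        return tuple(
            (np.asarray(v, dtype=float) - vmin) / (vmax - vmin)
            for v, (vmin, vmax) in zip(values, self.limits))


norm = ToyMultiNorm([(0.0, 10.0), (0.0, 1.0)])
scaled = norm(([0.0, 5.0, 10.0], [0.25, 0.5, 1.0]))
```

Each variate is scaled independently, so the output is a tuple with one normalized array per input variate.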