Adding 2d support to quadmesh set_array #16908

greglucas · 2020-03-25T22:11:45Z

PR Summary

Adds the ability to call set_array() with 2d input arrays, which is the shape of data users typically create a QuadMesh with.

import matplotlib.pyplot as plt
import numpy as np
z = np.random.random((5, 5))
fig, ax = plt.subplots()
coll = ax.pcolormesh(np.ones(z.shape))
coll.set_array(z)

It still calls ravel() under the hood, so the private data remains 1-dimensional; this is just a convenience for users. Relates to a portion of #15388.
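The convenience described above can be pictured with plain NumPy. This is only a sketch, not the code in the PR: `flatten_for_mesh` is a hypothetical helper standing in for whatever set_array() does internally, and it assumes the default row-major (C) order of `ravel()`.

```python
import numpy as np


def flatten_for_mesh(A):
    """Hypothetical helper: return mesh data as a 1d array.

    Accepts the 2d array a user typically has, or an already flat 1d array.
    """
    A = np.asanyarray(A)
    if A.ndim not in (1, 2):
        raise ValueError(f"expected 1d or 2d data, got {A.ndim}d")
    # ravel() walks the array in row-major order by default and
    # avoids a copy when the data is already contiguous.
    return A.ravel()


z = np.arange(6).reshape((2, 3))
flat = flatten_for_mesh(z)
print(flat.shape)          # (6,)
print(flat[4] == z[1, 1])  # True: flat index = row * ncols + col
```

The point of the sketch is that the 2d layout carries no extra information for the stored data: the flat index can always be recovered from the row and column.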

PR Checklist

Has Pytest style unit tests
Code is Flake 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

greglucas · 2020-03-25T22:14:48Z

lib/matplotlib/collections.py

+    def set_array(self, A):
+        # Allow QuadMesh.set_array(A) to accept 2d input
+        # as long as it is the same size as the current 1d data
+        if A.ndim == 2 and A.size == self._A.size:


It might make sense to actually issue a warning here if A.size != self._A.size to address some common issues of the edges (n+1, m+1) compared to the data (n, m). Thoughts?
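A warning of the kind floated here might look like the following sketch. `warn_on_size_mismatch` and its `current_size` argument are hypothetical names for illustration; nothing in this thread says such a check exists in matplotlib.

```python
import warnings

import numpy as np


def warn_on_size_mismatch(A, current_size):
    """Hypothetical check: warn when new data does not match the stored size."""
    A = np.asanyarray(A)
    if A.size != current_size:
        warnings.warn(
            f"array has {A.size} elements but the mesh currently holds "
            f"{current_size}; were edge-shaped (n+1, m+1) data passed "
            f"instead of (n, m)?",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True


n, m = 4, 5
data = np.zeros((n, m))            # one value per quadrilateral
edges = np.zeros((n + 1, m + 1))   # one value per vertex

print(warn_on_size_mismatch(data, n * m))  # True, no warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = warn_on_size_mismatch(edges, n * m)
print(ok, len(caught))  # False 1
```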

I'm a little confused as to why A needs to be raveled at all here? Why does QuadMesh store its array as 1-D?

QuLogic · 2020-03-26T22:52:37Z

I think we need to be careful here to handle all the cases from #16258.

jklymak

I guess this needs to drop the last row and column if the old-style coercion is to take place?
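Dropping the last row and column, as suggested here, would look roughly like the following in NumPy. The helper name `coerce_to_mesh_shape` and its arguments are made up for illustration; the thread does not say how (or whether) matplotlib performs this coercion in set_array().

```python
import numpy as np


def coerce_to_mesh_shape(A, mesh_height, mesh_width):
    """Hypothetical helper: trim vertex-shaped data to one value per quad."""
    A = np.asanyarray(A)
    if A.shape == (mesh_height + 1, mesh_width + 1):
        # Data has the same shape as the vertices: drop the last row/column.
        return A[:-1, :-1]
    return A


A = np.arange(12).reshape((3, 4))        # shaped like a 3 x 4 grid of vertices
trimmed = coerce_to_mesh_shape(A, 2, 3)  # the mesh has 2 x 3 quadrilaterals
print(trimmed.shape)  # (2, 3)
```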

As noted below, I don't see why A needs to be ravel-ed? Why don't we just store A in place as an array?

greglucas · 2020-03-27T14:27:14Z

I agree, @jklymak, I think that it would be ideal to store the actual data array. imshow already does this:
plt.imshow(np.random.random((5, 5))).get_array().ndim == 2

The current docstring mentions vertices (a 2d mesh) and coordinates (an n x 2 mesh), which doesn't really help the confusion here:

matplotlib/lib/matplotlib/collections.py

Lines 1907 to 1932 in d9b722f

    
class QuadMesh(Collection):
    """
    Class for the efficient drawing of a quadrilateral mesh.
    A quadrilateral mesh consists of a grid of vertices.
    The dimensions of this array are (*meshWidth* + 1, *meshHeight* + 1).
    Each vertex in the mesh has a different set of "mesh coordinates"
    representing its position in the topology of the mesh.
    For any values (*m*, *n*) such that 0 <= *m* <= *meshWidth*
    and 0 <= *n* <= *meshHeight*, the vertices at mesh coordinates
    (*m*, *n*), (*m*, *n* + 1), (*m* + 1, *n* + 1), and (*m* + 1, *n*)
    form one of the quadrilaterals in the mesh. There are thus
    (*meshWidth* * *meshHeight*) quadrilaterals in the mesh.  The mesh
    need not be regular and the polygons need not be convex.
    A quadrilateral mesh is represented by a (2 x ((*meshWidth* + 1) *
    (*meshHeight* + 1))) numpy array *coordinates*, where each row is
    the *x* and *y* coordinates of one of the vertices.  To define the
    function that maps from a data point to its corresponding color,
    use the :meth:`set_cmap` method.  Each of these arrays is indexed in
    row-major order by the mesh coordinates of the vertex (or the mesh
    coordinates of the lower left vertex, in the case of the colors).
    For example, the first entry in *coordinates* is the coordinates of the
    vertex at mesh coordinates (0, 0), then the one at (0, 1), then at (0, 2)
    .. (0, meshWidth), (1, 0), (1, 1), and so on.
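The row-major ordering that the docstring describes can be made concrete with a small NumPy sketch. It assumes a regular grid and lays *coordinates* out as one (x, y) pair per row, which is one reading of the docstring; `vertex_index` is a hypothetical helper, not a matplotlib function.

```python
import numpy as np

mesh_width, mesh_height = 3, 2  # 3 x 2 quadrilaterals

# Vertex positions on a regular grid, one more vertex than quads per axis.
x = np.arange(mesh_width + 1)
y = np.arange(mesh_height + 1)
X, Y = np.meshgrid(x, y)  # both have shape (mesh_height + 1, mesh_width + 1)

# One (x, y) pair per row, vertices listed in row-major order.
coordinates = np.column_stack([X.ravel(), Y.ravel()])
print(coordinates.shape)  # (12, 2)


def vertex_index(row, col):
    """Flat index of the vertex at mesh coordinates (row, col)."""
    return row * (mesh_width + 1) + col


print(coordinates[vertex_index(1, 0)])  # [0 1]
```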

The reason I chose this simplistic thing was that I didn't want to go down the rabbit hole of deprecation issues. Would it be possible to change that at this point, even? My guess is that people may be using get_array() and set_array() with expected 1d arrays both ways, so I'm not sure how we would go about adding a 2d option without messing with someone's current 1d implementation.

Even more confusing in all of this is that you can call set_array() with a different shape than the coordinates or original data!
Try:
plt.imshow(np.random.random((5, 5))).set_array(np.random.random((5, 4)))
and look at where the ticks are.

And QuadMesh will actually tile your missing data for you:
plt.pcolormesh(np.arange(25).reshape((5, 5))).set_array(np.arange(10).reshape((5, 2)).ravel())

The two methods handle the different sized arrays differently. These seem like they could be quite confusing mistakes to track down!
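The tiling described for QuadMesh resembles what `np.resize` does when asked for more elements than it is given. The sketch below uses `np.resize` only because it reproduces that tile-or-truncate behaviour on flat data; the thread does not state which call matplotlib uses internally, so treat the equivalence as an assumption.

```python
import numpy as np

n_quads = 25  # a 5 x 5 mesh

too_short = np.arange(10)
tiled = np.resize(too_short, n_quads)  # repeats the data to fill 25 slots
print(tiled[10:13])  # [0 1 2]

too_long = np.arange(30)
chopped = np.resize(too_long, n_quads)  # keeps only the first 25 values
print(chopped[-1])  # 24
```

Either way the result has exactly one value per quadrilateral, which is why a wrongly sized array produces a plausible-looking plot rather than an error.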

After that long digression... I agree that this current implementation I have here is less than ideal and may lead to more confusion down the line. I would be willing to add 2d capabilities to QuadMesh if people think that is an OK idea. I also would be in favor of starting to add some requirements on set_array() depending on what is stored in the coordinates currently.

greglucas · 2020-03-29T17:14:19Z

I looked a little more into this and the current backend implementors expect the shape of coordinates to be (m+1, n+1, 2), and facecolors/edgecolors to be (m*n). This is baked into the backends already to be flattened arrays and my guess is it would possibly cause a headache to change this.
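For reference, the two shapes mentioned here can be built by hand as follows. This is a NumPy-only sketch with made-up values of m and n; it shows the shapes only and does not call any backend code.

```python
import numpy as np

m, n = 3, 4  # hypothetical mesh with m * n quadrilaterals

# Vertex grid: one more point than quadrilaterals along each axis.
X, Y = np.meshgrid(np.linspace(0, 1, n + 1), np.linspace(0, 1, m + 1))
coordinates = np.stack([X, Y], axis=-1)
print(coordinates.shape)  # (4, 5, 2), i.e. (m + 1, n + 1, 2)

# One scalar per quadrilateral, flattened before it is handed on.
data = np.random.random((m, n))
flat = data.ravel()
print(flat.shape)  # (12,), i.e. (m * n,)
```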

I just pushed up a new proposed modification that will allow set_array() to accept 2D arrays for QuadMesh now and ravel whatever data they have (either 1D or 2D) into the correct shape internally to pass on to the backends. It maintains current functionality and just extends it to allow 2D data arrays for QuadMesh objects.

There could be more checks done on the Python side to make sure arrays are the right size, etc., but that also wasn't done before, so I'm not sure it should be done here.

jklymak

This seems an improvement to me. We can discuss doing something with more context later.

tacaswell · 2020-03-30T02:11:56Z

plt.imshow(np.random.random((5, 5))).set_array(np.random.random((5, 4))) ...

This is the expected behavior due to the way imshow works under the hood (it has both a data array and an extent; see https://matplotlib.org/tutorials/intermediate/imshow_extent.html). The alternate behavior (the extent changing when you set the data) would be much more confusing.

Does it fail loudly (even if from a not-great place) if you set misshapen data? Are there cases that used to fail that now pass?

tacaswell · 2020-03-30T02:13:38Z

lib/matplotlib/collections.py

@@ -782,7 +782,8 @@ def update_scalarmappable(self):
        """Update colors from the scalar mappable array, if it is not None."""
        if self._A is None:
            return
-        if self._A.ndim > 1:
+        # QuadMesh can map 2d arrays
+        if self._A.ndim > 1 and not isinstance(self, QuadMesh):


Could we do the raveling here as

    if isinstance(self, QuadMesh):
        self._A = self._A.ravel()
    else:
        raise ValueError(...)

It keeps everything a bit more consistent shape-wise?

That's actually what I had originally, but force-pushed over. The downside to it is if someone calls get_array() it is a different shape returned, which could cause confusion?

That is a fair point, but I am also worried about the shape stability for people who have code written against QuadMesh and who are now going to be surprised that they sometimes get back 2d data.

That should only happen if they pass in 2d data, which was not possible before. So all the previous cases had 1d inputs and will still get 1d arrays back. This is really for my selfish future motivation of wanting to update animations without forgetting to ravel() and getting the ValueError thrown my way. I completely agree, though: it does add another layer of potential confusion, and that should be weighed in the pros and cons.

I am worried about the (hypothetical) case where someone has written a function that takes in a QuadMesh, uses get_array(), and assumes 1d data. If we do the reshaping at the last minute then that assumption is no longer valid, but there is no way for the function author to reasonably know.

I do see both sides of this and neither is obviously better.
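The trade-off in this thread can be boiled down to two toy classes. Both are hypothetical stand-ins written for this sketch (`RavelOnSet`, `RavelOnUse`, and `colors_input` are invented names), not matplotlib classes.

```python
import numpy as np


class RavelOnSet:
    """Hypothetical mappable that flattens data as soon as it is set."""

    def set_array(self, A):
        self._A = np.asanyarray(A).ravel()

    def get_array(self):
        return self._A


class RavelOnUse:
    """Hypothetical mappable that stores data as given and flattens late."""

    def set_array(self, A):
        self._A = np.asanyarray(A)

    def get_array(self):
        return self._A

    def colors_input(self):
        # Flatten only at the point where 1d data is required.
        return self._A.ravel()


z = np.zeros((5, 5))
early, late = RavelOnSet(), RavelOnUse()
early.set_array(z)
late.set_array(z)
print(early.get_array().shape)  # (25,): always 1d, but not what was passed in
print(late.get_array().shape)   # (5, 5): round-trips, but callers may see 2d
```

With the first design, code that assumes 1d output from get_array() keeps working; with the second, get_array() returns what the user passed in. Neither property can be had without giving up the other.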

tacaswell · 2020-03-30T02:14:30Z

I am 👍 on this in principle, but have a small concern that this is going to mask other bugs.

QuLogic · 2020-04-16T01:11:14Z

So do we need tests for the various other input shape combinations? I guess at least, it would be nice to have tests for invalid shapes.

greglucas · 2020-04-17T01:01:38Z

Unfortunately, there aren't really any "bad" shapes now. QuadMesh will accept flattened arrays that are larger or smaller with no problem and then either chop the data off or tile it up for you, respectively. See this long comment for the background there: #16908 (comment)

I'm hesitant to add a strict check on A's shape/size in case someone is (ab)using that feature. I did have that initially for just the 2d QuadMesh case (see: #16908 (comment)) but I took that away to just delegate the getting/setting of A to the superclass instead.

tacaswell · 2020-05-25T03:16:04Z

@greglucas we have a preference for rebasing rather than merging the master branch into feature branches. I took the liberty of doing the re-base and force pushed to your branch.

greglucas · 2020-05-25T13:05:57Z

Thanks! I also noticed that one of the comments was incorrect now too, so I pushed up a change for that just now.

efiring · 2020-05-25T18:49:10Z

I think this is OK as a logical improvement, but perhaps the next step should be to override QuadMesh.set_array so that it checks that the dimensions of its argument, whether 1-D or 2-D, are consistent with the mesh dimensions set when the QuadMesh was instantiated.
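A check along those lines could be sketched as below. `validate_mesh_array` and its `mesh_height`/`mesh_width` parameters are hypothetical; this shows the idea only and is not code from this PR or from the follow-up issue.

```python
import numpy as np


def validate_mesh_array(A, mesh_height, mesh_width):
    """Hypothetical check that data matches a mesh_height x mesh_width mesh."""
    A = np.asanyarray(A)
    if A.ndim == 1:
        ok = A.size == mesh_height * mesh_width
    elif A.ndim == 2:
        ok = A.shape == (mesh_height, mesh_width)
    else:
        ok = False
    if not ok:
        raise ValueError(
            f"array of shape {A.shape} is incompatible with a "
            f"{mesh_height} x {mesh_width} mesh"
        )
    return A


validate_mesh_array(np.zeros((5, 5)), 5, 5)  # accepted
validate_mesh_array(np.zeros(25), 5, 5)      # accepted
try:
    validate_mesh_array(np.zeros((5, 4)), 5, 5)
except ValueError as err:
    print(err)  # array of shape (5, 4) is incompatible with a 5 x 5 mesh
```

Note that a strict check like this would reject the tile-or-truncate inputs that are currently accepted, which is the backward-compatibility concern raised earlier in the thread.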

QuLogic requested a review from jklymak March 26, 2020 22:52

jklymak reviewed Mar 27, 2020

greglucas force-pushed the quadmesh_set_array branch from 639fa14 to 8ddf3ea Compare March 29, 2020 17:09

jklymak approved these changes Mar 29, 2020

tacaswell added this to the v3.3.0 milestone Mar 30, 2020

tacaswell reviewed Mar 30, 2020

QuLogic mentioned this pull request May 12, 2020

Support url on more Artists in svg #17338

Merged

4 tasks

tacaswell force-pushed the quadmesh_set_array branch from 095618d to 1df7e92 Compare May 25, 2020 03:15

tacaswell approved these changes May 25, 2020

Adding 2d support to quadmesh set_array

5e5ac01

greglucas force-pushed the quadmesh_set_array branch from 1df7e92 to 5e5ac01 Compare May 25, 2020 13:04

tacaswell merged commit 49593b7 into matplotlib:master May 25, 2020

efiring mentioned this pull request May 25, 2020

Quadmesh.set_array should validate dimensions #17508

Closed

tacaswell mentioned this pull request Jun 2, 2020

matplotlib.collections.QuadMesh.set_array() input arg format is weird and undocumented #15388

Closed

greglucas deleted the quadmesh_set_array branch July 7, 2020 22:35
