
TLouf
Contributor

@TLouf TLouf commented Jul 18, 2021

fixes #1269 #1379

Here's a proposal to make categorical plots and their legends easier to handle. The main idea is to give the proper label to each collection that we plot, so that the legend can be created automatically by calling ax.legend, without manually creating handles and labels. This lets lines be represented by lines, points by points and polygons by squares directly in the legend. Some may prefer to have each geometry type in its own legend, but I went for the simplest option here.
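To illustrate the labelling idea with matplotlib alone, here is a minimal sketch. It is not the code from this PR: the per-category segments and colors are made up for the example. Each category gets its own labelled LineCollection, and ax.legend() then builds the entries without explicit handles.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Hypothetical data: one list of line segments per category.
segments_by_category = {
    "M": [[(1, 1), (2, 2), (3, 2), (5, 3)]],
    "S": [[(3, 4), (5, 7), (12, 2), (10, 5), (9, 7.5)]],
}
colors = {"M": "black", "S": "red"}

fig, ax = plt.subplots()
for category, segments in segments_by_category.items():
    # The label set on the collection is what ax.legend() picks up later.
    collection = LineCollection(segments, colors=colors[category], label=category)
    ax.add_collection(collection)
ax.autoscale_view()

# No handles or labels are passed: the legend is built from the labelled artists.
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Because the legend entries come from the artists themselves, a line collection is drawn as a line in the legend.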

When color is given as a dict that maps values to colors, as suggested in the issue, we treat it like a categorical plot with a custom colormap, which makes sense I think. I imagined a user would intuitively try to do the same with all kinds of styling arguments, so I also tried to implement that.

Here are two example plots:

from shapely.geometry import LineString
import geopandas as geopd

linea = LineString([(1, 1), (2, 2), (3, 2), (5, 3)])
lineb = LineString([(3, 4), (5, 7), (12, 2), (10, 5), (9, 7.5)])
linec = LineString([(3, 3), (5, 5), (9, 7.5)])
gdf_lines = geopd.GeoDataFrame(
    [1, 2, None], geometry=[linea, lineb, linec], crs="epsg:4326"
)
gdf_lines["type"] = ["M", "S", None]
gdf_lines.plot(column='type', color={'M': 'black', 'S': 'r'}, legend=True,
               linewidth=[1, 4], missing_kwds={'color': 'gray'})

Screenshot from 2021-07-18 19-18-34

gdf_lines.plot(column='type', categorical=True, legend=True, linewidth={'S': 4, 'M': 2})

Screenshot from 2021-07-18 19-19-59

This is a WIP as I've only adapted _plot_linestring_collection for now. Also, I quickly fixed the usage of "scheme" but didn't test thoroughly yet, and there are still some broken edge cases, in particular when the style mappings provided to plot exclude some of the present categories.

Anyway I submitted at this stage to get feedback on the approach, so let me know what you think!

@martinfleis
Member

@TLouf Thanks for looking into it. I like this approach a lot!

If you can finish support for other geom types, add tests, and document this in the User Guide, this would be a super valuable contribution!

@TLouf TLouf marked this pull request as draft August 11, 2021 10:36
@TLouf
Contributor Author

TLouf commented Aug 11, 2021

Just extended this to the other geometry types. It was not as easy as I initially thought. In particular, for polygons the default handling of PatchCollection in the legend (the handler is the class that generates the handles in the legend) does not work properly, so I have to map PatchCollection to the HandlerPolyCollection handler (see the ax.legend wrapper _legend_with_poly_wrapper I defined). At first I had just added this "handler_map" to legend_kwds, but I figured some users might want to call ax.legend after calling plot, so this wrapper lets them do so and obtain the same result as if they had called plot with legend=True.
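For reference, the handler mapping described above can be sketched with matplotlib alone. This is a simplified illustration, not the _legend_with_poly_wrapper from this PR: it passes handler_map directly to ax.legend, and the polygon coordinates and styles are made up.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from matplotlib.legend_handler import HandlerPolyCollection
from matplotlib.patches import Polygon as PolygonPatch

fig, ax = plt.subplots()

# Hypothetical polygon, wrapped in a labelled PatchCollection.
patch = PolygonPatch([(1, 1), (2, 2), (3, 2), (5, 3)], closed=True)
collection = PatchCollection(
    [patch], facecolor="b", edgecolor="r", hatch="xx", label="M"
)
ax.add_collection(collection)
ax.autoscale_view()

# Tell the legend to draw PatchCollection entries with the PolyCollection handler,
# which renders a rectangle using the collection's face color, edge color and hatch.
legend = ax.legend(handler_map={PatchCollection: HandlerPolyCollection()})
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Wrapping ax.legend, as the PR does, applies the same mapping without the user having to pass handler_map on every call.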

Anyway, I'm quite satisfied with the kind of results I get:

from shapely.geometry import LineString, Point, Polygon
import shapely.wkt as wkt
import geopandas as gpd

linea = LineString([(1, 1), (2, 2), (3, 2), (5, 3)])
lineb = LineString([(3, 4), (5, 7), (12, 2), (10, 5), (9, 7.5)])
linec = LineString([(3, 3), (5, 5), (9, 7.5)])
pointa = Point(2, 1.7)
pointb = Point(3, 4)
lined = wkt.loads('LINESTRING EMPTY')
gdf = gpd.GeoDataFrame(
    {'num': [1, 8, 2, 5, 1, 5], 'type': ["M", "S", None, "S", "M", "S"]},
    geometry=[Polygon(linea), lined, lineb, linec, pointa, pointb], crs="epsg:4326"
)

gdf.plot(
    column='type', legend=True, edgecolor={'M': 'r', 'S': 'k'},
    hatch={'M': 'xx', 'S': '//'}, alpha={'M': 0.5, 'S': 1},
    color={'M': 'b', 'S': 'r'}, marker={'M': '+', 'S': '+'},
    missing_kwds={'color': 'gray'}, linewidth=[1, 4, 1, 5, 1, 1]
)

lol

I didn't consider the implications to scheme for now, as it's probably better for me to wait until #2019 is merged before fixing anything. I'll get onto writing additional tests next. About that, how do I select which keywords to test on? I thought maybe a problematic one (like 'marker', which can only be passed as a single str), and another, like linewidth. Or should I cover more of them?

@martinfleis
Member

martinfleis commented Aug 14, 2021

Nice job!

I didn't consider the implications to scheme for now, as it's probably better for me to wait until #2019 is merged before fixing anything.

Makes sense.

how do I select which keywords to test on?

Just pick the most sensible ones to cover all geometry types. So marker, linewidth and hatch or facecolor? It is up to you.

(I'll do the code review later)

@TLouf
Contributor Author

TLouf commented Sep 26, 2021

I've rebased onto master to integrate #2019 and I get your new test failing on the legend part @martinfleis. For instance I get the following for the first test plot:

image

So only what's actually plotted appears in the legend, because the legend is created automatically. IMO this makes sense, but I never actually use scheme, so maybe my opinion is not very aligned with that of a typical user. If we want to keep exactly the same behaviour as before, the only option I see for now is to manually create the legend only when using scheme, but that's quite dirty.
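As a point of comparison, the manual alternative mentioned above could look roughly like the sketch below. It is an illustration only, with made-up bin labels and colors: proxy Patch handles are created for every class bin, so bins with no plotted geometry still show up in the legend.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Hypothetical class bins and colors; a real classification scheme would supply its own.
bin_colors = {"0 - 3": "#440154", "3 - 6": "#21918c", "6 - 9": "#fde725"}

fig, ax = plt.subplots()
# Only data from the first bin is plotted, and the artist carries no label.
ax.scatter([1, 2], [1, 2], color=bin_colors["0 - 3"])

# Proxy artists: one Patch per bin, whether or not that bin has any plotted data.
handles = [Patch(facecolor=color, label=label) for label, color in bin_colors.items()]
legend = ax.legend(handles=handles)
legend_labels = [text.get_text() for text in legend.get_texts()]
```

With automatic legends, only the first bin would be listed here, which is the behaviour difference the failing test exposes.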

The two other tests are failing because I moved all the norm-building logic out of _plot_polygon_collection and the similar functions to simplify the code, and these tests check whether passing vmin and vmax to them works correctly. However, since these functions are internal, I don't know if it's worth keeping that behaviour. What do you think?

By the way, I also allowed the dictionary-style keyword arguments to omit some categories, in which case the geometries whose category is not in the dictionary get the default style. In practice this makes the following possible:

ax = gdf.plot(column='type', legend=True, marker={'M': '+'})

With the setup above, this gives:

image
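The fallback for missing categories can be pictured with a small pure-Python sketch. The helper name resolve_category_styles and the default marker 'o' are assumptions made for this example, not names or values taken from the PR.

```python
def resolve_category_styles(style, categories, default):
    """Return one style value per category.

    `style` may be a dict covering only some categories, or a single value
    applied to all of them. Categories missing from the dict get `default`.
    """
    if isinstance(style, dict):
        return {category: style.get(category, default) for category in categories}
    return {category: style for category in categories}


# Only 'M' is styled explicitly; 'S' falls back to the assumed default marker.
marker_by_category = resolve_category_styles({"M": "+"}, ["M", "S"], default="o")
# A plain value is simply repeated for every category.
linewidth_by_category = resolve_category_styles(2, ["M", "S"], default=1)
```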

@TLouf
Contributor Author

TLouf commented Oct 1, 2021

Actually I've been thinking, it would make more sense to me if the legend of a plot with a classifying scheme was a discrete colorbar, rather than a legend as it is now, which I find a bit ugly and hard to read. This also makes more sense from the point of view of matplotlib, as labelling objects which are not plotted in a legend is not natural, but it is more so in a colorbar. This should probably be discussed in a separate issue, and would entail a big change of behaviour facing the user, so I don't know if it should be changed, but I wanted to know if that made sense to anyone other than me.
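For anyone unfamiliar with the idea, a discrete colorbar can be built in matplotlib with BoundaryNorm. The sketch below is only an illustration under assumed values: the bin edges and data are made up, and it says nothing about how geopandas would implement this.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm

# Hypothetical class breaks, as a classification scheme might produce.
bin_edges = [0, 3, 6, 9]
values = [1, 8, 2, 5, 1, 5]

# One color per bin: resample the colormap and map each bin to one color index.
discrete_cmap = plt.get_cmap("viridis", len(bin_edges) - 1)
norm = BoundaryNorm(bin_edges, discrete_cmap.N)

fig, ax = plt.subplots()
points = ax.scatter(range(len(values)), values, c=values, cmap=discrete_cmap, norm=norm)

# The colorbar shows every bin, including bins that contain no plotted value.
colorbar = fig.colorbar(points, ax=ax)
bin_indices = [int(norm(value)) for value in (1, 4, 7)]
```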

@martinfleis
Member

it would make more sense to me if the legend of a plot with a classifying scheme was a discrete colorbar

Generally, yes. But we should allow both options because in some cases, the colorbar is not fully legible. With explore we currently support both options so you can pick the one that works better for your use case. I think that in the ideal case, we would support both in static plotting as well. But yeah, that is for another issue (feel free to move these to a new one).

Screenshot 2021-10-01 at 10 37 05


Successfully merging this pull request may close these issues.

Make it easier to create categorical plots with custom colors