Facet grid FigureFactory #731

Kully · 2017-04-12T19:36:54Z

will be making a notebook for a proper showcase of these charts

chriddyp · 2017-04-12T19:57:07Z

plotly/figure_factory/_facet_grid.py

+        num_of_rows = len(data[facet_row].unique())
+
+    if facet_col:
+        num_of_cols = len(data[facet_col].unique())


Thoughts about adding binning for numerical columns?

I wasn't thinking about it but I can after I get something working. I'm not finished with the code here, so I can ping you when I do

cldougl · 2017-04-12T20:09:10Z

plotly/figure_factory/__init__.py

@@ -16,3 +16,4 @@
 from plotly.figure_factory._table import create_table
 from plotly.figure_factory._trisurf import create_trisurf
 from plotly.figure_factory._violin import create_violin
+from plotly.figure_factory._facet_grid import create_facet_grid


imports in alphabetical order

jackparmer · 2017-04-13T17:42:55Z

How 1-1 is this figure factory with the ggplot2 facet geom? Would be great to see some screenshot output examples when this gets closer.

Kully · 2017-04-13T18:02:54Z

How 1-1 is this figure factory with the ggplot2 facet geom?

pretty close, just missing some variables right now. Yeah, I'm gonna put together either screenshots or a notebook presentation or both.

Kully · 2017-04-17T00:50:07Z

Progress:

Kully · 2017-04-19T03:59:47Z

@jackparmer Here's my progress so far.
@cldougl Ready for a review.
@chriddyp Here's a Jupyter Notebook of some examples!

Things that we could add:

- binning for numerical columns (Chris' suggestion; already exists for the scatterplot_matrix FigureFactory)
- axis titles should be in the middle of each axis

Feel free to make suggestions. 😄

jackparmer · 2017-04-20T19:59:42Z

@Kully This looks very cool! Can you please make a Jupyter notebook that uses these dataframes and recreates these examples using your FigureFactory?

http://ggplot2.tidyverse.org/reference/facet_grid.html

I think this will be a nice way to be sure we're covering canonical faceting usage.

Kully · 2017-04-20T20:19:44Z

Can you please make a Jupyter notebook that uses these dataframes and recreates these examples using your FigureFactory?

For sure!

cldougl · 2017-04-25T18:46:16Z

plotly/tests/test_optional/test_figure_factory.py

@@ -3,6 +3,7 @@
 from plotly.exceptions import PlotlyError

 import plotly.tools as tls
+import plotly.figure_factory as ff


thanks for updating all of these @Kully !

cldougl · 2017-04-25T18:47:30Z

plotly/tests/test_optional/test_figure_factory.py

                          **kwargs)

    def test_unequal_data_label_length(self):
        kwargs = {'hist_data': [[1, 2]], 'group_labels': ['group', 'group2']}
-        self.assertRaises(PlotlyError, tls.FigureFactory.create_distplot,
+        self.assertRaises(PlotlyError, ff.create_distplot,


same- one line?

cldougl · 2017-04-25T18:48:00Z

plotly/tests/test_optional/test_figure_factory.py

@@ -32,24 +33,24 @@ def test_wrong_histdata_format(self):
        # will fail)

        kwargs = {'hist_data': [1, 2, 3], 'group_labels': ['group']}
-        self.assertRaises(PlotlyError, tls.FigureFactory.create_distplot,
+        self.assertRaises(PlotlyError, ff.create_distplot,


looks like this could be one line now (with **kwargs)

cldougl · 2017-04-25T18:49:51Z

plotly/tests/test_optional/test_figure_factory.py

                          **kwargs)

        kwargs = {'hist_data': [[1, 2], [1, 2, 3]], 'group_labels': ['group']}
-        self.assertRaises(PlotlyError, tls.FigureFactory.create_distplot,
+        self.assertRaises(PlotlyError, ff.create_distplot,


same - one line?

cldougl · 2017-04-25T18:50:05Z

plotly/tests/test_optional/test_figure_factory.py

                          **kwargs)

    def test_simple_distplot_prob_density(self):

        # we should be able to create a single distplot with a simple dataset
        # and default kwargs

-        dp = tls.FigureFactory.create_distplot(hist_data=[[1, 2, 2, 3]],
+        dp = ff.create_distplot(hist_data=[[1, 2, 2, 3]],


fix spacing w/ lines below

cldougl · 2017-04-25T18:57:41Z

plotly/figure_factory/_facet_grid.py

+from plotly.figure_factory import utils
+import plotly.colors as colors
+
+from plotly.graph_objs import graph_objs


you can group all from plotly... imports together

cldougl · 2017-04-25T19:06:01Z

plotly/figure_factory/_facet_grid.py

+    )
+
+    return annotation_dict
+


you can add 2 new lines between defs

cldougl · 2017-04-25T19:06:53Z

plotly/figure_factory/_facet_grid.py

+    fig['layout']['annotations'] = annotations
+
+    return fig
+


2 spaces between defs

cldougl · 2017-04-25T19:10:15Z

plotly/figure_factory/_facet_grid.py

+                      colorscale=None, color_dict=None, title='facet grid',
+                      height=600, width=600, **kwargs):
+    """
+    Returns data for a facet grid.


this returns a figure right? not just the data object

changed 'data' -> 'figure'

cldougl · 2017-04-25T19:29:09Z

plotly/figure_factory/_facet_grid.py

+    """
+    if not pd:
+        raise exceptions.ImportError(
+            "'pandas' must be imported for this FigureFactory."


since we changed the syntax I would probably say figure_factory or figure type or chart type

cldougl · 2017-04-25T19:55:29Z

@Kully looks like a weird thing is going on when toggling the traces:

Kully · 2017-05-03T20:40:20Z

@Kully looks like a weird thing is going on when toggling the traces:

Yeah, it's pretty wack.

For some reason when a subplot is empty, the background just disappears. I think it has to do with all the customization I'm using.

Are you a fan of having all those duplicate trace icons, or is one per category enough?

chriddyp · 2017-06-01T19:50:20Z

plotly/figure_factory/_facet_grid.py

+        colormap will be treated as categorical (True) or sequential (False).
+            Default = False.
+    :param (bool) widen_frame: if set to True, all points in each subplot
+        are strickly contained in the region of the subplot by increasing the


spelling: strictly

chriddyp · 2017-06-01T19:51:29Z

plotly/figure_factory/_facet_grid.py

+    :param (bool) widen_frame: if set to True, all points in each subplot
+        are strickly contained in the region of the subplot by increasing the
+        maximum and minimum range values by 1. Setting to False doesn't do
+        anything.


maybe "If False, then points may be plotted on the edge of the frame."

chriddyp · 2017-06-01T19:52:04Z

plotly/figure_factory/_facet_grid.py

+        anything.
+        Default = False
+    :param (str|dict) facet_row_labels: set to either 'name' or a dictionary
+        of all the values in the facetting row mapped to some text to show up


spelling: faceting

and it's all of the "unique values" not just "values", right?

chriddyp · 2017-06-01T19:52:44Z

plotly/figure_factory/_facet_grid.py

+        Default = False
+    :param (str|dict) facet_row_labels: set to either 'name' or a dictionary
+        of all the values in the facetting row mapped to some text to show up
+        in the label annotations. If None, labelling works like usual.


spelling: labeling

chriddyp · 2017-06-01T19:53:59Z

plotly/figure_factory/_facet_grid.py

+    :param (int) width: the width of the facet grid figure.
+    :param (int) size: point size in pixels.
+    :param (str) trace_type: decides the type of plot to appear in the
+        facet grid. The options are 'scatter' and 'scattergl'.


although presumably you could pass in something like histogram and a set of kwargs to customize that histogram right?

we could even skip applying the marker and mode options if type is not in ['scatter', 'scattergl'], allowing users to create valid facet_grids with heatmaps, histograms, 2dhistograms, etc

In order for histogram to work, you'd need to also make x and y optional too

For this pass, can we not include trace types besides scatter and scattergl? mode for example is not a valid key for a bar chart, and would result in an error.

I don't think turning off validation for the figure is a good idea, but the way I'd implement the option to add other trace types is to have a dictionary of keys that would go in the trace dict for each specific trace
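A minimal sketch of the "dictionary of keys per trace type" idea, written with plain dicts so it doesn't depend on plotly. `TRACE_DEFAULTS` and `build_trace` are hypothetical names used only for illustration; they are not part of `_facet_grid.py`. The point is that `mode` and `marker` are only attached to the scatter types, and `x` and `y` are both optional so a histogram can be given just one of them.

```python
import copy

# Hypothetical per-type defaults. `mode` and `marker` are real plotly trace
# attributes, but this table is only an illustration of the approach.
TRACE_DEFAULTS = {
    'scatter': {'mode': 'markers', 'marker': {'size': 8}},
    'scattergl': {'mode': 'markers', 'marker': {'size': 8}},
    'histogram': {},
    'bar': {},
}


def build_trace(trace_type, x=None, y=None, **kwargs):
    """Return a plain trace dict for one facet (hypothetical helper)."""
    if trace_type not in TRACE_DEFAULTS:
        raise ValueError('unsupported trace type: {!r}'.format(trace_type))

    trace = {'type': trace_type}
    # x and y are optional: a histogram usually gets only one of them.
    if x is not None:
        trace['x'] = list(x)
    if y is not None:
        trace['y'] = list(y)

    # Only keys registered for this trace type are added, so a bar or
    # histogram trace never receives `mode`.
    trace.update(copy.deepcopy(TRACE_DEFAULTS[trace_type]))
    # User-supplied kwargs win over the defaults.
    trace.update(kwargs)
    return trace


scatter_trace = build_trace('scatter', x=[1, 2], y=[3, 4])
histogram_trace = build_trace('histogram', x=[1, 1, 2])
```

Because validation stays on, anything a user passes through `kwargs` would still be checked by plotly when the figure is built.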

chriddyp

Looking pretty good so far! This is going to be really great, I can already see how I'll use this in Dash :) My main comments are around:

- Numerical binning instead of using the unique numerical values
- Making the code DRYer

I'll finish up reviewing this evening!

chriddyp · 2017-06-02T02:10:46Z

Some other thoughts as I play around with this:

I'm curious if it can be generalized a little bit to be able to plot any chart type, not just scatter. I find myself frequently wanting to make a faceted histogram, so this could be a big win. It seems like it could with a few small changes:
- making x and y optional: in a histogram you would only supply x (or, if you wanted horizontal histogram, y)
- skipping the marker assignments if the type isn't scatter or scattergl
Sizing - I think that we should make the height a multiple of the number of rows and the width a multiple of the number of columns. Something like height=max(600, 175 * n_rows). That way, if you have 20 unique values, each subplot isn't squeezed down to a sliver. If you have only two unique values, then we keep the default height (like 600 or w/e it is)
Text - We should add an optional text column_name that adds text to the grouped values
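A rough sketch of the sizing rule, for discussion. `facet_grid_size` is a hypothetical helper, and applying the same 175 px figure to columns is an assumption (the comment above only gives a number for rows). `max` is used so that 600 acts as a floor rather than a cap, which is what gives tall grids enough room while small grids keep the default size.

```python
DEFAULT_HEIGHT = 600
DEFAULT_WIDTH = 600
PX_PER_ROW = 175
PX_PER_COL = 175  # assumption: same per-facet size as rows


def facet_grid_size(n_rows, n_cols):
    """Return (height, width) in pixels for a grid of n_rows x n_cols facets."""
    # The default is a lower bound; bigger grids scale with the facet count.
    height = max(DEFAULT_HEIGHT, PX_PER_ROW * n_rows)
    width = max(DEFAULT_WIDTH, PX_PER_COL * n_cols)
    return height, width


small_grid = facet_grid_size(n_rows=2, n_cols=2)
tall_grid = facet_grid_size(n_rows=20, n_cols=1)
```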

chriddyp · 2017-06-02T02:21:33Z

Is there a requirement that the graphs need to look like ggplot2 by default? I would prefer that we keep the styling more aligned to the plotly defaults to be consistent with the rest of the library. There are also a few things that bother me about the ggplot styling:

The points are pure black which is really stark. Without opacity, it's hard to tell if they overlap - here's an article about avoiding pure blacks in design: https://ianstormtaylor.com/design-tip-never-use-black/
The points are really small.
I personally find the grey background and grey background shapes kind of old looking
I think the tick placement is much heavier than it needs to be

Some people like the theme however, and so we can keep it in with maybe a theme=ggplot2 argument.

Here is a little redesign:

Compared with:

Here is the code I'm using to convert the facet_grid styles:

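import copy
import plotly.figure_factory as ff

# assumes `mpg` is the ggplot2 mpg dataset already loaded as a pandas DataFrame
gg = ff.create_facet_grid(mpg, 'cty', 'hwy', 'class')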
p = copy.deepcopy(gg)
for trace in p['data']:
    # Overwrite marker styles: 
    # - slightly bigger than usual points
    # - opacity in the points
    # - a small border around the points
    trace['marker'] = {
        'color': 'rgba(31, 119, 180, 0.5)',
        'size': 8,
        'line': {'color': 'darkgrey', 'width': 1}
    }

# removing the dark grey background
del p['layout']['shapes']

# making the plot height proportional to the number of rows
p['layout']['height'] = len(p['data']) * 150


for k, v in p['layout'].items():
    if 'axis' in k:
        # Reverting to default grid styles
        del v['gridcolor']
        del v['gridwidth']
        del v['tickfont']
        del v['tickwidth']
        del v['dtick']
        # Except the ticks: removing the ticks
        v['ticklen'] = 0

# Removing the grey background
del p['layout']['plot_bgcolor']

# Adding a slightly off-white margin background
# This makes it easier to distinguish one subplot from the other
p['layout']['paper_bgcolor'] = 'rgb(251, 251, 251)'

# Update hovering mode for scatter plots
p['layout']['hovermode'] = 'closest'

chriddyp · 2017-06-02T02:25:10Z

plotly/figure_factory/_facet_grid.py

+    kwargs.pop('marker', None)
+
+    # make sure dataframe index starts at 0
+    df.index = range(len(df))


I'm not sure we want to be doing this: this is going to be modifying the user's dataframe without them knowing.

For example:

Damn, you're right. I can fix that with some rewriting.
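A small sketch of the side effect being discussed, assuming pandas is installed. The two helper names are hypothetical. Assigning to `df.index` inside a function changes the caller's DataFrame, while `reset_index(drop=True)` returns a new, zero-based DataFrame and leaves the caller's untouched.

```python
import pandas as pd


def reindex_in_place(df):
    """Mimics the line under review: overwrites the caller's index."""
    df.index = range(len(df))
    return df


def reindex_copy(df):
    """Returns a re-indexed copy and leaves the caller's DataFrame alone."""
    return df.reset_index(drop=True)


original = pd.DataFrame({'x': [1, 2, 3]}, index=['a', 'b', 'c'])

# Safe version first: the caller's index is still a, b, c afterwards.
safe = reindex_copy(original)
index_after_copy = list(original.index)

# In-place version: the caller's index has silently become 0, 1, 2.
reindex_in_place(original)
index_after_in_place = list(original.index)
```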

chriddyp · 2017-06-02T02:25:38Z

plotly/figure_factory/_facet_grid.py

+                if key not in facet_row_labels.keys():
+                    unique_keys = df[facet_row].unique().tolist()
+                    raise exceptions.PlotlyError(
+                        "If you are using a dictioanry for custom labels for "


spelling: "dictionary"

chriddyp · 2017-06-02T02:25:56Z

plotly/figure_factory/_facet_grid.py

+                if key not in facet_col_labels.keys():
+                    unique_keys = df[facet_col].unique().tolist()
+                    raise exceptions.PlotlyError(
+                        "If you are using a dictioanry for custom labels for "


spelling: "dictionary".

chriddyp · 2017-06-02T02:42:34Z

plotly/figure_factory/_facet_grid.py

+            max_range = math.ceil(max_range)
+            if widen_frame:
+                min_range -= 1
+                max_range += 1


Good call in fixing the range across all subplots so that it's easy to compare values.

I think we can be a bit smarter about setting the autorange though and then make it the default and remove this widen_frame argument:

Adding/subtracting 1 like this won't work if the numerical ranges are really big or if the numerical ranges are within a small range like between 0.5 and 0.6 (in which case it'll end up making the range too large)

Instead, why don't we just make it say 5% bigger than the absolute range. I believe this is how plotly.js does it. Something like: range = [min - (max - min) * 0.05, max + (max - min) * 0.05]. You could ask the #plotly_js folks exactly what value they use

Great idea Chris
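A minimal sketch of the proportional padding suggested above. `padded_range` is a hypothetical helper, and the 5% default is the figure floated in the comment, not a confirmed plotly.js value. Because the padding scales with the span of the data, it behaves sensibly for both 0.5 to 0.6 and 0 to 1000.

```python
def padded_range(values, pad_fraction=0.05):
    """Return [low, high] widened by pad_fraction of the data span on each side."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * pad_fraction
    return [lo - pad, hi + pad]


narrow = padded_range([0.5, 0.52, 0.6])
wide = padded_range([0, 250, 1000])
```

One case this sketch does not handle: if every value is identical the span is zero and no padding is added, so a real implementation would need a fallback there.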

chriddyp · 2017-06-02T02:43:57Z

plotly/figure_factory/utils.py

+    'Blackbody': ['rgb(0,0,0)', 'rgb(160,200,255)'],
+    'Earth': ['rgb(0,0,130)', 'rgb(255,255,255)'],
+    'Electric': ['rgb(0,0,0)', 'rgb(255,250,220)'],
+    'Viridis': ['#440154', '#fde725']


These colorscales seem a little off - aren't there supposed to be more than 2 colors in them?

Yeah that colorscale list is not correct. It's being used by _scatterplot.py for colorscales, so I eventually want to make a PR to replace the colorscales in utils with plotly.colors.PLOTLY_SCALES (the proper one) and rewrite the way scatterplot handles colorscales.

Will handle in a separate PR with a new issue: #769

Kully · 2017-06-08T19:06:01Z

The points are pure black which is really stark. Without opacity, it's hard to tell if they overlap - here's an article about avoiding pure blacks in design: https://ianstormtaylor.com/design-tip-never-use-black/

Good read. It's funny because the "black" that they pit against the examples of nearly-black in app layouts are actually not #000000 either. The website may be swiping a filter over the whole page

Kully · 2017-06-09T22:31:51Z

@jackparmer @chriddyp @cldougl
A few words about this PR. I want to get this thing merged ASAP (today if possible).

Can one of you do a quick review of the facet grid? CHANGELOG and version number are already bumped for the next pip package.
I want to move the remaining things/ideas for this figure factory - numerical binning/custom cuts for binning/other trace-type support - to another PR, as they are not canonical features.

Sounds good?

jackparmer · 2017-06-10T03:49:42Z

I want to move the remaining things/ideas for this figure factory - numerical binning/custom cuts for binning/other trace-type support - to another PR, as they are not canonical features.

SGTM!

Kully · 2017-06-10T17:49:12Z

@jackparmer @chriddyp 💃 ?

chriddyp · 2017-06-12T15:17:28Z

numerical binning/custom cuts for binning to another PR

Yeah, that sounds good. In the meantime, we can document a workflow where the user creates their own bins in their dataframe and facets off of that. For example:

>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1, 2, 3, 4, 1, 2, 3, 4], 'y': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']})
>>> df['group'] = pd.cut(df['x'], 5)
>>> df
   x  y         group
0  1  a  (0.997, 1.6]
1  2  b    (1.6, 2.2]
2  3  c    (2.8, 3.4]
3  4  d      (3.4, 4]
4  1  e  (0.997, 1.6]
5  2  f    (1.6, 2.2]
6  3  g    (2.8, 3.4]
7  4  h      (3.4, 4]

or, with custom labels:

>>> df['group'] = pd.cut(df['x'], 5, labels=['first group', 'second group', 'third group', 'fourth group', 'fifth group'])
>>> df
   x  y         group
0  1  a   first group
1  2  b  second group
2  3  c  fourth group
3  4  d   fifth group
4  1  e   first group
5  2  f  second group
6  3  g  fourth group
7  4  h   fifth group
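To round out the workflow above, here is a hedged sketch of faceting on the binned column. The numeric `y` values and the two-bin split are made up for illustration, and casting the bins to strings is only a precaution, not a documented requirement. `make_figure` is a hypothetical wrapper; it imports plotly lazily so the binning part can be run on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3, 4, 1, 2, 3, 4],
    'y': [10, 20, 30, 40, 15, 25, 35, 45],  # illustrative values
})

# Bin the numeric column first, then facet on the resulting labels.
df['group'] = pd.cut(df['x'], 2, labels=['low', 'high']).astype(str)


def make_figure(frame):
    """Build one facet column per bin label (requires plotly)."""
    import plotly.figure_factory as ff
    return ff.create_facet_grid(frame, x='x', y='y', facet_col='group')
```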

chriddyp · 2017-06-12T15:18:58Z

thanks for making all of those changes. this feature is really great now. 💃 !

Kully added 2 commits April 12, 2017 15:24

most of the work on facet_grid sans color variable capabilities

18a91aa

duplicated validation functions to utils from colors for colorscale v…

78a9bab

…alidation

Kully changed the title ~~Facet grid~~ Facet grid FigureFactory Apr 12, 2017

change PLOTLY_SCALES to output proper colorscale structure from utils

6bdb56e

chriddyp reviewed Apr 12, 2017

cldougl reviewed Apr 12, 2017

Kully added 5 commits April 17, 2017 22:28

add **kwargs to facet_grid

dc94878

start tests for facet_grid

a7503a9

finished tests for facet_grid

c0a5827

reverted utils ver of PLOTLY_SCALES to just 2-item list

d5643c6

added TODO line (for another issue in plotly.py)

baf7b63

cldougl reviewed Apr 25, 2017

fix syntax comments in test_figure_factory

c10316d

Kully added 9 commits May 3, 2017 16:58

chelsea comments

6336f63

add colormap description in colors.py and working on facet_grid

07ae3f1

color to color_name and colormap is the only color variable

93e96d9

update schema

154a178

some more changes

16fa00c

added static range attempt

fefbe4c

axis ranges for facetting row/col are identical

0249874

fix same-range bugs

13f51a7

added legend factor title for coloring

aad3b71

chriddyp reviewed Jun 1, 2017

chriddyp reviewed Jun 2, 2017

Kully added 6 commits June 9, 2017 09:47

added ggplot2 mode

d7fdd3f

updated tests

e33ed1b

move endpts_to_intervals to utils from _scatterplot.py

b329fb8

update changelog

889199e

version bump

8347650

Merge branch 'master' into facet_grid

ec33221

Kully added 3 commits June 10, 2017 12:28

added complete working tests

4d320b9

merge it

d35abf6

Merge branch 'facet_grid' of https://github.com/plotly/plotly.py into…

036c2fa

… facet_grid Going to merge because I feel like it

Kully merged commit f52c5db into master Jun 12, 2017

Kully deleted the facet_grid branch June 12, 2017 15:20
