Adds option to plot average in boxplot, besides the median #2520

Closed
wants to merge 6 commits into from

Conversation

miggaiowski

I wrote this as I needed it and matplotlib didn't support it.

Please let me know if it looks good or if anything needs improvement.

I've attached an example of how it looks.
[Image: united_states_averages]

@@ -2568,7 +2568,7 @@ def broken_barh(xranges, yrange, hold=None, **kwargs):
 @_autogen_docstring(Axes.boxplot)
 def boxplot(x, notch=False, sym='b+', vert=True, whis=1.5, positions=None,
             widths=None, patch_artist=False, bootstrap=None, usermedians=None,
-            conf_intervals=None, hold=None):
+            conf_intervals=None, averages=False, useraverages=None, hold=None):

Member

being pedantic about reverse compatibility, I think you should add these after hold=None so anyone using all of the positional arguments won't be surprised
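
To illustrate the concern, here is a minimal sketch using hypothetical stand-in functions (shortened parameter lists, not the real matplotlib signatures). A caller who passes every argument positionally silently gets a different meaning when new parameters are inserted before `hold`, but is unaffected when they are appended after it.

```python
# Hypothetical stand-ins for the old and new signatures. The parameter
# lists are shortened and the bodies only echo what was received.
def boxplot_old(x, conf_intervals=None, hold=None):
    return {"hold": hold}


def boxplot_inserted(x, conf_intervals=None, averages=False,
                     useraverages=None, hold=None):
    # New parameters placed BEFORE hold: positional callers break.
    return {"averages": averages, "hold": hold}


def boxplot_appended(x, conf_intervals=None, hold=None,
                     averages=False, useraverages=None):
    # New parameters placed AFTER hold: positional callers are unaffected.
    return {"averages": averages, "hold": hold}


data = [1, 2, 3]
# An existing call that passes hold=True as the third positional argument.
old = boxplot_old(data, None, True)
inserted = boxplot_inserted(data, None, True)   # True lands in `averages`
appended = boxplot_appended(data, None, True)   # True still lands in `hold`
```

With the inserted variant the caller's `True` is bound to `averages` and `hold` falls back to its default, which is exactly the surprise the comment warns about.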

Author

Thanks, just fixed it!

@@ -2916,7 +2930,7 @@ def computeConfInterval(data, med, iq, bootstrap):
         if not self._hold:
             self.cla()
         holdStatus = self._hold
-        whiskers, caps, boxes, medians, fliers = [], [], [], [], []
+        whiskers, caps, boxes, medians, average_values, fliers = [], [], [], [], [], []

Member

I suspect that the travis failure is coming from this line, can you re-arrange this so it is < 80chr per line?
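
One possible way to bring that assignment under the limit, shown only as a sketch (the fix actually committed in the PR is not visible in this thread), is to split it into two statements:

```python
# One list per artist type, split over two statements so that each
# line stays under 80 characters.
whiskers, caps, boxes = [], [], []
medians, average_values, fliers = [], [], []
```

Each name is still bound to its own fresh list, so appending to one does not affect the others.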

Author

Right, just fixed it. Thanks

+            if useraverages is not None:
+                if useraverages[i] is not None:
+                    avg = useraverages[i]

Member

This is a minor point, but would it make more sense to re-arrange this logic so that the average is only computed if needed?
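
A minimal sketch of the suggested rearrangement follows. The helper name `pick_average` is hypothetical, and the built-in `sum`/`len` stand in for `np.average(d)` so the snippet runs without NumPy. It is not the code that was eventually committed.

```python
def pick_average(d, i, useraverages=None):
    """Return the user-supplied average for dataset i, else compute it.

    Hypothetical helper: the mean is only computed when no override
    exists for this dataset.
    """
    if useraverages is not None and useraverages[i] is not None:
        return useraverages[i]
    # Reached only when there is no override, so no work is wasted.
    return sum(d) / len(d)


computed = pick_average([1, 2, 3], 0)                    # no overrides at all
fallback = pick_average([1, 2, 3], 0, [None, 4.0])       # None means "compute"
override = pick_average([1, 2, 3], 1, [None, 4.0])       # user value wins
```

Checking for the override first keeps the same result as the original ordering while skipping the computation whenever a user value is present.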

Author

Sounds good, diff on the way.

On Tue, Oct 15, 2013 at 12:25 PM, Thomas A Caswell wrote:

In lib/matplotlib/axes/_axes.py:

@@ -2996,6 +3024,15 @@ def computeConfInterval(data, med, iq, bootstrap):
 if usermedians[i] is not None:
     med = usermedians[i]

+        if averages:
+            # get average
+            avg = np.average(d)
+            # replace with input averages if available
+            if useraverages is not None:
+                if useraverages[i] is not None:
+                    avg = useraverages[i]

This is a minor point, but would it make more sense to re-arrange this
logic so that the average is only computed if needed?

Miguel Gaiowski

avg = np.average(d)



Member

pep8 is going to complain about this white space

Author

Oops

@miggaiowski
Author

Is there anything else I need to do?

@mdboom
Member

mdboom commented Oct 16, 2013

This looks quite good. It needs a regression test.

@pelson, @efiring: You guys look like the most recent major editors of the boxplot code. Any comments?

@pelson
Member

pelson commented Oct 18, 2013

Looks good, though I'd like to see an existing test updated to include this functionality.

@miggaiowski
Author

I tried writing another boxplot test at matplotlib/lib/matplotlib/tests/test_axes.py but many tests are failing on my machine. The simple test_boxplot() test fails, and looking at the images the difference seems to be the font of the labels. Does that have something to do with Mac vs Linux?

I could just add another test function like this:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.testing.decorators import image_comparison


@image_comparison(baseline_images=['boxplot_with_averages'],
                  extensions=['pdf', 'png'])
def test_boxplot_with_averages():
    x = np.linspace(-7, 7, 140)
    x = np.hstack([-25, x, 25])
    fig = plt.figure()
    ax = fig.add_subplot(111)

    # show 1 boxplot with mpl medians/conf. intervals, 1 with manual values;
    # averages=True is the new option proposed in this PR
    ax.boxplot([x, x], bootstrap=10000, usermedians=[None, 1.0],
               conf_intervals=[None, (-1.0, 3.5)], averages=True)
    ax.set_ylim((-30, 30))

But first I'd need help figuring out how to run this test and get the previous test to pass.

@tacaswell
Member

@miggaiowski What OS are you using?

@miggaiowski
Author

Mac OS X.

Miguel Gaiowski

On Oct 26, 2013, at 8:13 PM, Thomas A Caswell wrote:

@miggaiowski What OS are you using?

@miggaiowski
Author

It's been a while, can we get this in without the tests?

I've been using it just fine, along with a colleague.

@tacaswell
Member

It no longer merges cleanly so it will need a re-base.

The unit tests are required not because we doubt that it works now, but so that a seemingly unrelated change does not break it in the future.

An array or sequence whose first dimension (or length) is
compatible with *x*. This overrides the averages computed by
matplotlib for each element of *useraverages* that is not None.
When an element of *useraverages* == None, the median will be

Member

The mean?

Author

Good catch =)

@phobson
Member

phobson commented Nov 26, 2013

@miggaiowski

I think this is a good addition. If you and the core team can hold off, I plan to submit a PR over the US Thanksgiving holiday that will split the boxplot into three functions: _compute_boxplot, _draw_boxplot, and an existing-API-compliant boxplot that relies on the other two methods.

The idea here is that _compute_boxplot will return a dict of possible values to draw (e.g., confidence intervals, means), and _draw_boxplot will accept that dictionary blindly and make it happen. The end result is that users can easily define their own dictionary and go straight to _draw_boxplot for advanced functionality.

The main advantage here is that there are a ton of different ways to render a boxplot[1] and the API is getting pretty bloated.

[1] e.g., I have 1 project manager who insists on putting the caps at the 5th and 95th percentiles, and another at the max and min

First pass at list of keys that will be in that dictionary:

  1. Medians
  2. Confidence intervals around the median
  3. Quartiles 1 and 3
  4. Whisker top and bottom (caps)
  5. outliers (fliers)
  6. arithmetic means (averages)
  7. anything else?
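
The compute/draw split and the dictionary described above might look roughly like the following sketch. The function names `compute_boxplot_stats` and `describe_boxplot`, the key names, and the quartile method are all assumptions made for illustration. Confidence intervals are omitted, and the "draw" step only formats text, so the example runs with the standard library alone.

```python
import statistics


def compute_boxplot_stats(data, whis=1.5):
    """Hypothetical stand-in for the proposed _compute_boxplot step."""
    # Quartiles by linear interpolation between order statistics.
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_limit, hi_limit = q1 - whis * iqr, q3 + whis * iqr
    inside = [v for v in data if lo_limit <= v <= hi_limit]
    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),    # lowest point within the limits
        "whisker_high": max(inside),   # highest point within the limits
        "fliers": [v for v in data if v < lo_limit or v > hi_limit],
        "mean": statistics.fmean(data),
    }


def describe_boxplot(stats):
    """Hypothetical stand-in for _draw_boxplot: consumes the dict blindly."""
    return "box {q1}..{q3}, median {median}, mean {mean}".format(**stats)


stats = compute_boxplot_stats([1, 2, 3, 4, 5, 100])
# A user can override any entry before "drawing", e.g. put the caps at
# the min and max instead of the whis-based limits.
custom = dict(stats, whisker_low=1, whisker_high=100, fliers=[])
summary = describe_boxplot(custom)
```

Because the drawing step only reads the dictionary, callers with unusual conventions (such as the cap placements mentioned in the footnote) can edit or build the dictionary themselves without any new keyword arguments.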

Another advantage to this approach is that the _draw_boxplot method should easily accommodate a style kwarg. That way people can select different ways to render the same values. (see http://nbviewer.ipython.org/5432378 for Tufte's take on boxplots)

@miggaiowski
Author

@phobson

Sounds good!
We can hold off and get it done when you finish your refactoring.

Thanks!

@miggaiowski miggaiowski reopened this Nov 26, 2013
@miggaiowski
Author

Should I close this or leave it open? =)

@tacaswell
Member

@miggaiowski Is there anything in this PR that is not included in #2643?

@miggaiowski
Author

I think it has everything.

On Sun, Jan 5, 2014 at 3:07 PM, Thomas A Caswell wrote:

@miggaiowski Is there anything in this PR that is not included in #2643?

Miguel Gaiowski

@tacaswell
Member

ok, then I am going to close this PR.

@tacaswell tacaswell closed this Jan 6, 2014
5 participants