Scatter autoscaling still has issues with log scaling and zero values #16552

moloney · 2020-02-18T19:51:01Z

I installed the 3.2.0rc3 package, and I can confirm that it fixes almost all of the autoscaling issues with scatter. However, if some points have y = 0 and log scaling is enabled, autoscaling of the scatter plot is still wrong (the problem does not show up with the standard plot function).

import itertools

import numpy as np
from matplotlib import pyplot as plt


x_vals = [4.38462e-06,
          5.54929e-06,
          7.02332e-06,
          8.88889e-06,
          1.125e-05,
          1.42383e-05,
          1.80203e-05,
          2.2807e-05,
          2.88651e-05,
          3.65324e-05,
          4.62363e-05,
          5.85178e-05,
          7.40616e-05,
          9.37342e-05,
          0.000118632,
          ]

y_vals = [0.0,
          0.10000000000000002,
          0.182,
          0.332,
          0.604,
          1.1,
          2.0,
          3.64,
          6.64,
          12.100000000000001,
          22.0,
          39.60000000000001,
          71.3,
          ]

pts = np.array(list(itertools.product(x_vals, y_vals)))

fig = plt.figure("Scatter plot")
ax = fig.gca()
ax.set_xscale('log')
ax.set_yscale('log')
ax.scatter(pts[:,0], pts[:,1]) # This only shows four rows of points

fig = plt.figure("Regular plot")
ax = fig.gca()
ax.set_xscale('log')
ax.set_yscale('log')
ax.plot(pts[:,0], pts[:,1], marker="o", ls="") # This works

plt.show()

Originally posted by @moloney in #6015 (comment)


JadinLuong · 2020-03-04T03:13:25Z

Hey, I'm new to matplotlib and I am willing to contribute. Would this be a good first issue? If so what are some first steps in tackling this issue?

timhoffm · 2020-03-06T01:16:45Z

@JadinLuong yes, this is suitable as a first issue.

First steps:

1. Run the above code example and verify that plot() yields a reasonable result while scatter() does not. What does plot() actually do with the 0 values? Is that reasonable?
2. Dig into the functions and find out why they behave differently.
3. Write a test for scatter() that checks the desired scaling behavior and thus currently fails.
4. Adapt scatter() so that it behaves like plot() with respect to scaling and now passes the above test.
5. Create a PR with your code.

If you have technical trouble or want to learn the development workflow and conventions for Matplotlib, check the developer's guide. Since it is quite lengthy, you may also just ask here if you have a particular question.
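As a rough illustration of steps 1 and 3, a check along the following lines compares the autoscaled y limits of plot() and scatter() on log axes. It is only a sketch: the helper names autoscaled_ylim and limits_match and the small data set are made up for this example, and the two limits should only agree on Matplotlib versions that contain the eventual fix.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend, no window needed
import matplotlib.pyplot as plt

# Small made-up data set; the important part is the y = 0 point.
x = [1e-5, 1e-4, 1e-3, 1e-2]
y = [0.0, 0.1, 2.0, 71.3]


def autoscaled_ylim(draw):
    """Return the autoscaled y limits after calling draw(ax) on log-log axes."""
    fig, ax = plt.subplots()
    ax.set_xscale("log")
    ax.set_yscale("log")
    draw(ax)
    limits = ax.get_ylim()
    plt.close(fig)
    return limits


def limits_match(a, b):
    """True if two (low, high) pairs agree up to floating-point noise."""
    return all(math.isclose(p, q, rel_tol=1e-9) for p, q in zip(a, b))


plot_ylim = autoscaled_ylim(lambda ax: ax.plot(x, y, marker="o", ls=""))
scatter_ylim = autoscaled_ylim(lambda ax: ax.scatter(x, y))

print("plot:   ", plot_ylim)
print("scatter:", scatter_ylim)
print("match:  ", limits_match(plot_ylim, scatter_ylim))
```

A real test in the Matplotlib test suite would more likely compare rendered figures, but comparing limits is enough to see whether the two code paths disagree.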

QuLogic · 2020-05-09T05:15:52Z

The problem here is that plot adds a Line2D, whereas scatter adds a PathCollection. In the former case, the Axes.dataLim is updated in Axes.add_line using Line2D.get_path(). In the latter case, the Axes.dataLim is updated in Axes.add_collection using Collection.get_datalim().

Updating using the Path means it will check every single point, while updating using the collection's data limits means it will only check the two corner points (minimum x/y and maximum x/y). Both give the same data limits, but minposy (the smallest strictly positive y seen) differs, because the y=0 point is ignored when computing it. With only the two corner points available, the only positive y is the maximum, so minposy is set to the maximum y.

On log scales, any view limit <= 0 is set to minposy. For plot, that's just the next value up (0.1), but for scatter, it's the maximum (71.3). You only see something in the end because the nonsingular check sees that vmin==vmax and picks 'nice' values around it.
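The difference can be reproduced with a small pure-Python model. This is a simplified sketch of the behavior described above, not Matplotlib's code; min_positive and log_view_limits are hypothetical helpers written only for this illustration.

```python
def min_positive(values):
    """Smallest strictly positive value, or inf if there is none."""
    positives = [v for v in values if v > 0]
    return min(positives) if positives else float("inf")


def log_view_limits(vmin, vmax, minpos):
    """Replace any non-positive limit by minpos, as on a log scale."""
    if vmin <= 0:
        vmin = minpos
    if vmax <= 0:
        vmax = minpos
    return vmin, vmax


y_vals = [0.0, 0.1, 0.182, 0.332, 0.604, 1.1, 2.0, 3.64, 6.64, 12.1, 22.0, 39.6, 71.3]

# plot(): every vertex of the path is inspected.
minpos_plot = min_positive(y_vals)

# scatter() before the fix: only the two corner values are inspected.
minpos_scatter = min_positive([min(y_vals), max(y_vals)])

plot_limits = log_view_limits(min(y_vals), max(y_vals), minpos_plot)
scatter_limits = log_view_limits(min(y_vals), max(y_vals), minpos_scatter)

print(plot_limits)     # (0.1, 71.3)
print(scatter_limits)  # (71.3, 71.3)
```

The degenerate (71.3, 71.3) interval is what the nonsingular check then widens, which is why only a few rows of points end up visible.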

To fix this, there would need to be some change in add_collection to actually set the right minpos? values. For scatter, it could grab the actual path and update dataLim using it, but I don't think that works for all collections, and it would be a bit of a hack.

QuLogic · 2020-05-09T05:27:36Z

Collection.get_datalim returns a Bbox, so we do have somewhere to store minpos? without complicated API changes. However, Axes.add_collection converts that into a Path to update the dataLim, throwing away any of that.

So this would need some coordination to say that Collection.get_datalim should fill in minpos? and that Bbox.update_from_data_xy (or probably a new function) should merge that information.

This is perhaps not difficult, but might require some care and thorough testing, so I'm not too sure it'll be fixed for 3.2.2.
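The idea of letting the bounding box carry minpos and merging it on update could look roughly like this. It is a sketch under stated assumptions: Box, merge and collection_datalim are hypothetical stand-ins that track only y, and none of this is Matplotlib's Bbox API.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical 1-D bounding box that also tracks the smallest positive y."""
    ymin: float = math.inf
    ymax: float = -math.inf
    minposy: float = math.inf

    def merge(self, other):
        """Combine two boxes, keeping the extents and the smaller minposy."""
        return Box(
            min(self.ymin, other.ymin),
            max(self.ymax, other.ymax),
            min(self.minposy, other.minposy),
        )


def collection_datalim(ys):
    """What a collection could report: extents plus minposy from all points."""
    positives = [y for y in ys if y > 0]
    return Box(min(ys), max(ys), min(positives, default=math.inf))


# The axes start with an empty box and merge in the collection's box.
axes_datalim = Box().merge(collection_datalim([0.0, 0.1, 2.0, 71.3]))
print(axes_datalim)  # Box(ymin=0.0, ymax=71.3, minposy=0.1)
```

The point of the design is that the collection still hands over a single small object, yet the information needed for log scales is not lost along the way.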

QuLogic · 2020-10-03T00:00:50Z

Surprisingly, this works and does not break any tests:

diff --git a/lib/matplotlib/collections.py b/lib/matplotlib/collections.py
index 51c6c50a03..e785ceb462 100644
--- a/lib/matplotlib/collections.py
+++ b/lib/matplotlib/collections.py
@@ -290,9 +290,7 @@ class Collection(artist.Artist, cm.ScalarMappable):
                 # note A-B means A B^{-1}
                 offsets = np.ma.masked_invalid(offsets)
                 if not offsets.mask.all():
-                    points = np.row_stack((offsets.min(axis=0),
-                                           offsets.max(axis=0)))
-                    return transforms.Bbox(points)
+                    return offsets
         return transforms.Bbox.null()
 
     def get_window_extent(self, renderer):

It just happens to work because Axes.update_datalim uses Bbox.update_from_data_xy I mentioned, and Axes.add_collection passes the above result directly to it. However, it's not really a great patch, as then Collection.get_datalim sometimes returns an array of points, and sometimes a bbox. And it doesn't handle the other cases in Collection.get_datalim.

If we added a second method that returned the points, while Collection.get_datalim returned the bbox from it, then we could have add_collection call this new method and update the Axes dataLim with the full data.
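A toy version of that two-method split might look as follows. All names here (ToyCollection, get_datalim_points, smallest_positive_y) are hypothetical and chosen for illustration; this is not the API that was eventually merged.

```python
class ToyCollection:
    """Hypothetical stand-in for a collection holding (x, y) offsets."""

    def __init__(self, offsets):
        self._offsets = [tuple(p) for p in offsets]

    def get_datalim_points(self):
        """Hypothetical new method: every point that should affect the limits."""
        return list(self._offsets)

    def get_datalim(self):
        """Always a box ((xmin, ymin), (xmax, ymax)), or None if empty."""
        points = self.get_datalim_points()
        if not points:
            return None
        xs, ys = zip(*points)
        return (min(xs), min(ys)), (max(xs), max(ys))


def smallest_positive_y(points):
    """What a caller such as add_collection could derive from the full data."""
    return min((y for _, y in points if y > 0), default=float("inf"))


coll = ToyCollection([(1e-5, 0.0), (1e-4, 0.1), (1e-3, 71.3)])
bbox = coll.get_datalim()
minposy = smallest_positive_y(coll.get_datalim_points())

print(bbox)     # ((1e-05, 0.0), (0.001, 71.3))
print(minposy)  # 0.1
```

With this split, get_datalim keeps a single return type, while the caller that needs minpos works from the full set of points.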


petor-traffs · 2020-12-02T23:42:38Z

Hello!! Is this issue still open? I'd like to try it!

dopplershift · 2020-12-04T20:48:09Z

Looks like #18642 has been opened to try to fix this one. Might not be a good starting point.

tacaswell added this to the v3.2.1 milestone Feb 18, 2020

vincentt117 added a commit to CSCD01/matplotlib-team28 that referenced this issue Mar 10, 2020

Added unit tests for bug fix for matplotlib#16552

f58bdfd

vincentt117 added a commit to CSCD01/matplotlib-team28 that referenced this issue Mar 10, 2020

Fixed missing headers in tests for bug fix for matplotlib#16552

2d99501

tacaswell modified the milestones: v3.2.1, v3.2.2 Mar 16, 2020

QuLogic modified the milestones: v3.2.2, v3.4.0 May 9, 2020

QuLogic mentioned this issue Oct 2, 2020

Log scale scatter limits incorrect if zero value present in data #18630

Closed

QuLogic added a commit to QuLogic/matplotlib that referenced this issue Oct 3, 2020

Add a test for scatter autolim on log scale.

9265a1c

This test is a distilled out of matplotlib#16552.

QuLogic mentioned this issue Oct 3, 2020

Propagate minpos from Collections to Axes.datalim #18642

Merged

7 tasks

QuLogic added a commit to QuLogic/matplotlib that referenced this issue Oct 7, 2020

Add a test for scatter autolim on log scale.

a73c862

This test is a distilled out of matplotlib#16552.

QuLogic added a commit to QuLogic/matplotlib that referenced this issue Oct 9, 2020

Add a test for scatter autolim on log scale.

f9978ab

This test is a distilled out of matplotlib#16552.

QuLogic added a commit to QuLogic/matplotlib that referenced this issue Oct 16, 2020

Add a test for scatter autolim on log scale.

279ec45

This test is a distilled out of matplotlib#16552.

jklymak closed this as completed in #18642 Jan 5, 2021
