Off-axes scatter() points unnecessarily saved to PDF when coloured #2488
Comments
Closed as I believe that #2423 fixed this problem. |
I was under the impression from @mdboom that scatter() had a very different code path, and fixing this for scatter would be much more difficult than for plot(): Has scatter been tested? I should pull from git to test it out again, but I'm not quite feeling up for that right now... |
@mspacek Yeah, you are right, closed this erroneously. |
Yes -- scatter is so much more flexible -- each item can have its own transform, and the only way to determine (in the general case) if a patch is off the axes is to actually transform all of its points anyway, which probably doesn't result in terribly large savings (though I suppose one would save the stroking time). If you pre-determine that all of the transformations are scale/translation only, without rotation/skew, one could simply transform the bounding box of the patch to determine whether it's outside of the image, and this would probably be fast enough to be worth the effort. |
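To make the bounding-box idea concrete, here is a minimal sketch in plain Python. It is not Matplotlib's code path: the function names, the `(x0, y0, x1, y1)` tuple layout, and the clip rectangle are all assumptions made up for illustration. It only covers the restricted case described above, where each marker's transform is a scale plus a translation, so the transformed bounding box is again an axis-aligned box.

```python
def transformed_bbox(bbox, scale, offset):
    """Map an axis-aligned box through x' = sx*x + tx, y' = sy*y + ty.

    bbox is (x0, y0, x1, y1); scale is (sx, sy); offset is (tx, ty).
    Hypothetical helper: valid only without rotation or skew.
    """
    x0, y0, x1, y1 = bbox
    sx, sy = scale
    tx, ty = offset
    xa, xb = sx * x0 + tx, sx * x1 + tx
    ya, yb = sy * y0 + ty, sy * y1 + ty
    # A negative scale flips the box, so re-sort the corners.
    return (min(xa, xb), min(ya, yb), max(xa, xb), max(ya, yb))


def boxes_overlap(a, b):
    """True if two (x0, y0, x1, y1) boxes share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def marker_is_visible(marker_bbox, scale, offset, clip_rect):
    """Decide from the bounding box alone whether a marker could be drawn."""
    return boxes_overlap(transformed_bbox(marker_bbox, scale, offset), clip_rect)


# Assumed example values: a unit marker scaled by 5, in a 100x100 clip area.
clip = (0.0, 0.0, 100.0, 100.0)
unit_marker = (-1.0, -1.0, 1.0, 1.0)
inside = marker_is_visible(unit_marker, (5.0, 5.0), (50.0, 50.0), clip)
outside = marker_is_visible(unit_marker, (5.0, 5.0), (250.0, 50.0), clip)
straddling = marker_is_visible(unit_marker, (5.0, 5.0), (-3.0, 50.0), clip)
```

The check costs a handful of multiplications per marker regardless of how many vertices the marker path has, which is where any saving over transforming every point would come from. A marker that straddles the edge is kept, so the test can only skip markers that are entirely outside.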
Sharing my results on 1.5.3 using TkAgg:
|
This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help! |
This issue remains in matplotlib 3.7.1, Python 3.8, Qt5Agg backend:
import matplotlib.pyplot as plt
import numpy as np
x = np.random.random(20000)
y = np.random.random(20000)
c = np.random.random(20000)
plt.figure()
plt.scatter(x, y)
plt.savefig('scatter.pdf')
plt.xlim(2, 3) # move axes away for empty plot
plt.savefig('scatter_empty.pdf')
'''
file sizes in bytes:
scatter.pdf: 327682
scatter_empty.pdf: 6313
'''
plt.figure()
plt.scatter(x, y, c=c)
plt.savefig('scatter_color.pdf')
plt.xlim(2, 3) # move axes away for empty plot
plt.savefig('scatter_color_empty.pdf')
'''
file sizes in bytes:
scatter_color.pdf: 582991
scatter_color_empty.pdf: 583963 <--- should be much smaller
''' |
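Until this is handled in the backend, one possible user-side workaround is to drop the off-axes points before calling scatter(). The sketch below is an assumption-laden illustration, not anything from Matplotlib itself: `points_in_view` and its `pad` parameter are hypothetical names, and the limits are passed in by hand rather than read from the axes.

```python
import numpy as np


def points_in_view(x, y, xlim, ylim, pad=0.0):
    """Boolean mask of points whose centres fall inside the given limits.

    Hypothetical helper. `pad` is in data units and can be used to keep
    points just outside the limits whose markers would still be partly
    visible, since marker size is not taken into account here.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    xmin, xmax = sorted(xlim)  # sorted() also copes with inverted axes
    ymin, ymax = sorted(ylim)
    return (
        (x >= xmin - pad) & (x <= xmax + pad)
        & (y >= ymin - pad) & (y <= ymax + pad)
    )


# Same shape of data as the report above: everything lies in [0, 1).
rng = np.random.default_rng(0)
x = rng.random(20000)
y = rng.random(20000)
c = rng.random(20000)

# Limits moved away from the data, as in the report.
mask = points_in_view(x, y, xlim=(2, 3), ylim=(0, 1))
x_vis, y_vis, c_vis = x[mask], y[mask], c[mask]
# The filtered arrays would then be passed on, e.g.
# plt.scatter(x_vis, y_vis, c=c_vis)
```

Note that this changes what is plotted, not how it is saved: if the limits are changed again afterwards, the dropped points will not reappear, and colour normalisation is computed from the remaining values only unless vmin/vmax are set explicitly.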
@tacaswell this should probably (unfortunately) be re-opened :) |
I'm going to re-open this and label it as "good first issue" but with medium difficulty. It is a good first issue in that there are no API design choices to be made and there are two clear metrics to look at (the file size goes down in the special case and the run time does not go up (too much) in the general case). It is medium difficulty because this will likely require understanding the I think Mike's description in #2488 (comment) is still accurate. We do not know until the very (very) end if a given marker will be clipped or not. Concretely I see two places we might want to do this:
Without actually implementing both of them I do not have a good sense of which is the better approach. The exact work:
|
Scatter plotting a bunch of points while specifying the colour of each point, then changing the axes limits so none of the points are visible, and then saving the result to a PDF, results in a file just as big as if the points were all visible within their default axes limits. This doesn't seem to happen if the colour arg isn't passed to scatter(). I haven't tried, but specifying other kinds of point-specific attributes, like size, might also trigger the problem. Also, I haven't tried any of the other vector backends, but they may be affected as well. This came out of #2423.
Example code: