too large file size created by the errorbar of matplotlib #3345

Closed
ellot opened this issue Aug 6, 2014 · 14 comments · Fixed by #3485
@ellot

ellot commented Aug 6, 2014

I was trying to plot a lot of data with errors using matplotlib's errorbar. The resulting EPS or PDF file is very large, much larger than the one created by IDL. I tried rasterized=True, but it did not help. If I plot the data without errors using only plt.plot(x, y), the file is much smaller. I am using OS X 10.9 and Anaconda. Here is an example of my code:

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
x = np.linspace(-5., 5., 5000)
y = 3 * np.exp(-0.5 * (x - 1.3)**2 / 0.8**2)
y += np.random.normal(0., 0.2, x.shape)
f = plt.figure(0)
plt.errorbar(x,y, yerr=0.2,fmt='')
f.savefig('test.pdf')

@tacaswell
Member

What does matplotlib.__version__ give?

@tacaswell tacaswell added this to the v1.4.x milestone Aug 6, 2014
@ellot
Author

ellot commented Aug 14, 2014

I use matplotlib v1.3.1.
I also tried this on Fedora with matplotlib v1.3.1, and it has the same problem.
Is it fixed in a later version?

@jenshnielsen
Member

Could you please give some numbers for reference?
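One way to collect such numbers is to save the same figure in each format and record the byte counts. The following is a minimal sketch using only the standard library; the helper name `file_sizes` and the file names are illustrative and not part of matplotlib.

```python
import os


def file_sizes(paths):
    """Return {path: size in bytes} for every path that exists on disk."""
    return {p: os.path.getsize(p) for p in paths if os.path.exists(p)}


if __name__ == "__main__":
    # The file names are placeholders for whatever savefig() wrote.
    for path, size in file_sizes(["test.pdf", "test.eps"]).items():
        print(f"{path}: {size / 1024:.0f} KB")
```

Paths that do not exist are skipped, so the same list can be reused across backends that write only some of the formats.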

I would suggest testing the Cairo backend as a workaround:

import matplotlib
matplotlib.use('cairo')
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
x = np.linspace(-5., 5., 5000)
y = 3 * np.exp(-0.5 * (x - 1.3)**2 / 0.8**2)
y += np.random.normal(0., 0.2, x.shape)
f = plt.figure(0)
plt.errorbar(x,y, yerr=0.2,fmt='')
f.savefig('test.pdf')

These are the results that I get from testing your plot (MPL 1.4.x branch). I don't have access to IDL so I have no clue about the sizes that it generates.

  • Errorbar 1.505 MB
  • Errorbar Cairo 186 KB
  • Regular Plot 59 KB
  • Regular Plot Cairo 33 KB

Another thing is that the errorbars in this plot make no sense IMHO, since they are all on top of each other. I would reduce the number of errorbars with the errorevery argument to make them separable. That will also likely reduce the file size.
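As a rough illustration of the errorevery suggestion, the sketch below renders the example data to an in-memory PDF twice and compares the sizes. The helper name `pdf_size` and the thinning factor of 50 are assumptions chosen for illustration, and the actual sizes depend on the matplotlib version.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; savefig(format="pdf") still writes a PDF
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 5000)
y = 3 * np.exp(-0.5 * (x - 1.3) ** 2 / 0.8 ** 2) + rng.normal(0.0, 0.2, x.shape)


def pdf_size(errorevery):
    """Render the errorbar plot to an in-memory PDF and return its size in bytes."""
    fig, ax = plt.subplots()
    ax.errorbar(x, y, yerr=0.2, fmt="", errorevery=errorevery)
    buf = io.BytesIO()
    fig.savefig(buf, format="pdf")
    plt.close(fig)
    return buf.getbuffer().nbytes


size_all = pdf_size(1)    # one error bar per point, as in the report
size_thin = pdf_size(50)  # one error bar every 50 points (50 is an arbitrary choice)
print(size_all, size_thin)
```

The data line is still drawn with all 5000 points in both cases; only the number of error bars changes.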

@ellot
Author

ellot commented Aug 14, 2014

The original backend is MacOSX, and the file size is about 1.5 MB. If I create an EPS file and then run ps2pdf, the PDF file is about 232 KB.

@ellot
Author

ellot commented Aug 14, 2014

I tried the cairo backend on OS X 10.9, but I got an error:
"Fatal Python error: PyThreadState_Get: no current thread
Abort trap: 6"

@jenshnielsen
Member

How large is the IDL file?

You probably don't have the right dependencies for the cairo backend. You will need either cairocffi or pycairo installed and working, but without more details about how you installed matplotlib and Python it is impossible to know. The cairo backend certainly works on OS X 10.9.
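A quick way to see whether either binding is visible to the interpreter is sketched below. The function name `available_cairo_bindings` is made up for this example; it only checks that Python can locate the modules, not that the underlying cairo library loads correctly.

```python
import importlib.util


def available_cairo_bindings():
    """Map each candidate binding to whether Python can locate it."""
    # "cairo" is the import name of pycairo; cairocffi imports under its own name.
    candidates = {"pycairo": "cairo", "cairocffi": "cairocffi"}
    return {
        pkg: importlib.util.find_spec(mod) is not None
        for pkg, mod in candidates.items()
    }


print(available_cairo_bindings())
```

If both values are False, selecting the cairo backend cannot work until one of the two packages is installed.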

@tacaswell
Member

@jkseppan I have a hunch that this is related to a bug that was reported to me verbally, but hasn't made it into the tracker yet.

When saving collections, the pdf backend creates a new nested group (not sure if that is the correct pdf term) for every element instead of putting all of the elements in a single group.

I also suspect that this is related to the somewhat funny way we use graphics contexts.

@ellot
Author

ellot commented Aug 15, 2014

IDL cannot create a PDF file directly. It can create an EPS file, which I then convert with ps2pdf to get a PDF file of about 180 KB.

@ellot
Author

ellot commented Aug 15, 2014

I use Anaconda; matplotlib is included with it.

@jenshnielsen
Member

Thanks

It doesn't look like conda ships pycairo or cairocffi, so that will not work right away.

@jkseppan
Member

It seems that the errorbars get rendered using separate XObjects instead of just outputting the moveto and lineto operations in the page, and that blows up the file size.

@mdboom
Member

mdboom commented Aug 25, 2014

@jkseppan: That makes sense. Markers (which include ticks) all use XObjects, which normally results in a reduction in file size when those markers are used frequently. But maybe ticks are simple enough that not using XObjects is actually better.
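To make the trade-off concrete, here is a small sketch that builds the two encodings of a single vertical segment as uncompressed PDF content-stream text and compares their lengths. The helper names, the XObject name `M0`, and the coordinates are assumptions for illustration; matplotlib's real output is formatted differently and is usually compressed.

```python
def inline_segment(x, y0, y1):
    """A vertical line written straight into the page content stream."""
    # m = moveto, l = lineto, S = stroke
    return f"{x} {y0} m {x} {y1} l S\n"


def xobject_use(name, x, y):
    """A reference to a previously defined form XObject, translated to (x, y)."""
    # q/Q save and restore the graphics state, cm sets the transform, Do paints the XObject
    return f"q 1 0 0 1 {x} {y} cm /{name} Do Q\n"


inline = inline_segment(100, 50, 60)
use = xobject_use("M0", 100, 50)
print(len(inline), len(use))
```

For a two-point path the reference alone is already longer than the inline operators, and that is before counting the XObject definition itself, which is paid again for every error bar if each one gets its own XObject.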

jkseppan added a commit to jkseppan/matplotlib that referenced this issue Sep 7, 2014
jkseppan added a commit to jkseppan/matplotlib that referenced this issue Sep 7, 2014
@jkseppan
Member

jkseppan commented Sep 7, 2014

See #3485 for a proposed fix.

@jkseppan jkseppan added the has_patch label and removed the status: needs clarification label Sep 7, 2014
@jkseppan
Member

jkseppan commented Sep 7, 2014

There could be a deeper problem somewhere. Why is draw_path_collection being called with collections of one path each? It seems that there is code in Collection.draw that is intended to call draw_markers for such collections.

jkseppan added a commit to jkseppan/matplotlib that referenced this issue Sep 21, 2014
The cost calculations are very rough, back-of-the-envelope.
In the example from matplotlib#3345 I get significant file size reductions:

	2.3M	before.svg
	1.8M	after.svg
	916K	before.ps
	672K	after.ps
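A back-of-the-envelope cost comparison of this kind might look like the sketch below. It is not the code from the commit: the function names, the unit of "operations", and the fixed setup cost of 5 are all assumptions chosen to show how a break-even point arises.

```python
def inline_cost(ops_per_path, uses):
    """Operations emitted when every use writes the full path into the page."""
    return ops_per_path * uses


def xobject_cost(ops_per_path, uses, setup_ops=5):
    """Operations emitted when the path is defined once and referenced per use."""
    # setup_ops is an assumed fixed overhead for defining the shared object.
    return ops_per_path + setup_ops + uses


def prefer_xobject(ops_per_path, uses, setup_ops=5):
    """True when sharing the path is estimated to be cheaper than inlining it."""
    return xobject_cost(ops_per_path, uses, setup_ops) < inline_cost(ops_per_path, uses)


# A two-point error bar used once, as when each collection holds a single path.
print(prefer_xobject(ops_per_path=2, uses=1))
# A 20-point marker reused 5000 times.
print(prefer_xobject(ops_per_path=20, uses=5000))
```

Under these assumptions the single-use error bar is cheaper inline, while the heavily reused marker is cheaper as a shared object, which matches the reasoning in the comments above.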