ENH: Use png predictors when compressing images in pdf files #4605

jkseppan · 2015-07-08T15:27:34Z

This should reduce pdf file sizes when they include large raster images.

This code is failing at least the test I recently added to check that the alpha channel is output for grayscale images, so there must be some bug lurking in there.

It used to work only with RGBA data. Also ensure that the input is contiguous: the code passes row pointers to libpng, which reads each row directly from memory rather than through numpy methods. Improve the wording of a rarely occurring error message: a nonzero return from setjmp is not an error in setjmp but the result of a longjmp call.

The PDF format allows for PNG predictors in Flate-compressed streams, so we can use libpng to encode a png file and extract the row data from that.
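To illustrate the idea, here is a minimal sketch that uses only the standard library instead of libpng. It applies the PNG "Up" filter (type 2) to every row, Flate-compresses the result, and lists the decode parameters a PDF reader would need to undo the predictor. The names `png_up_filter` and `png_up_unfilter` and the gradient test image are made up for this example; the PR itself takes the filtered rows from a png file written by libpng.

```python
import zlib


def png_up_filter(rows):
    """Apply PNG filter type 2 ("Up") to equal-length rows of bytes."""
    out = bytearray()
    prev = bytes(len(rows[0]))  # the row above the first row counts as zeros
    for row in rows:
        out.append(2)  # every row starts with its filter-type tag
        out.extend((cur - up) & 0xFF for cur, up in zip(row, prev))
        prev = row
    return bytes(out)


def png_up_unfilter(data, row_bytes):
    """Invert png_up_filter; this sketch only understands filter type 2."""
    rows = []
    prev = bytes(row_bytes)
    stride = row_bytes + 1
    for start in range(0, len(data), stride):
        if data[start] != 2:
            raise ValueError("this sketch only decodes the Up filter")
        row = bytes((d + up) & 0xFF
                    for d, up in zip(data[start + 1:start + stride], prev))
        rows.append(row)
        prev = row
    return rows


# A 64x64 8-bit grayscale gradient: each row is the row above plus one.
width, height = 64, 64
rows = [bytes((x + y) & 0xFF for x in range(width)) for y in range(height)]

plain = zlib.compress(b"".join(rows))
predicted = zlib.compress(png_up_filter(rows))

# Decode parameters for the Flate stream. 12 names the Up filter; any value
# from 10 to 15 selects PNG predictors, and the per-row tag decides the rest.
decode_parms = {"Predictor": 12, "Colors": 1,
                "Columns": width, "BitsPerComponent": 8}
```

Comparing `len(plain)` with `len(predicted)` shows what the predictor buys for a given image; the gain depends heavily on the image content.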

WeatherGod · 2015-07-08T16:07:59Z

lib/matplotlib/backends/backend_pdf.py

-        rgb = rgba[:, :, :3].tostring()
-        a = rgba[:, :, 3]
-        if np.all(a == 255):
+        rgb = np.ascontiguousarray(rgba[:, :, :3])


When did this function become available in numpy?

I'm not sure, but it basically just calls array(..., order='C') with some additional arguments that don't matter here, so I used that instead.

Not sure when ascontiguousarray was introduced, order='C' should work in any version of numpy. Combine the very similar _gray and _rgb methods into one, and add some docstrings.
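A small sketch of why the copy matters, assuming numpy is installed. Slicing off the alpha channel yields a strided view into the RGBA buffer, while `np.array(..., order='C')` returns a C-contiguous copy whose rows can be addressed directly in memory. The 2x2 array is made-up sample data.

```python
import numpy as np

# Made-up 2x2 RGBA image, one byte per channel.
rgba = np.arange(2 * 2 * 4, dtype=np.uint8).reshape(2, 2, 4)

view = rgba[:, :, :3]            # drops alpha but keeps the RGBA strides
rgb = np.array(view, order='C')  # contiguous copy, as in the revised patch

print(view.flags['C_CONTIGUOUS'], view.strides)
print(rgb.flags['C_CONTIGUOUS'], rgb.strides)
```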

jkseppan · 2015-07-09T06:21:30Z

lib/matplotlib/backends/backend_pdf.py

+                                        else 'DeviceRGB'),
+               'BitsPerComponent': 8}
+        if smask:
+            obj['SMask'] = smask


The bug in my initial version was that I used 'Smask' instead of 'SMask' here.
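PDF names are case-sensitive, so a reader looking for SMask will not find a key spelled Smask and the soft mask is effectively missing. A minimal sketch of the dictionary being built, using plain strings and a hypothetical `image_dict` helper rather than the backend's own name and reference objects:

```python
def image_dict(width, height, grayscale, smask=None):
    """Build a simplified image XObject dictionary (illustration only)."""
    obj = {'Type': 'XObject',
           'Subtype': 'Image',
           'Width': width,
           'Height': height,
           'ColorSpace': 'DeviceGray' if grayscale else 'DeviceRGB',
           'BitsPerComponent': 8}
    if smask:
        # The key must be spelled exactly 'SMask'.
        obj['SMask'] = smask
    return obj


# 'mask-ref' stands in for a reference to the alpha-channel image object.
with_alpha = image_dict(4, 4, grayscale=True, smask='mask-ref')
without_alpha = image_dict(4, 4, grayscale=False)
```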

Apparently needed in Python 2.6.

WeatherGod · 2015-07-09T14:16:22Z

lib/matplotlib/backends/backend_pdf.py

-            alpha = None
+            alpha = np.array(alpha, order='C')
+        if im.is_grayscale:
+            r, g, b = rgb.astype(np.float32).transpose(2, 0, 1)


Just as a sanity check, when did this form of transpose() become available in numpy? It might have always been there, but I want to double-check.

Not sure when exactly, but it’s documented in the 2006 copy of numpybook.pdf, which mentions version 1.0.2.dev3478. We support numpy 1.6 and up. One of the Travis builders uses Python 2.6 with numpy 1.6, and this code does get exercised by some of the tests, as evidenced by the tests failing before I made the last two commits to make the code compatible with Python 2.6.

Ah, didn't realize one of our Travis instances was using np1.6. Carry on.
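For reference, a short sketch of what that form of transpose() does, assuming numpy is installed. Passing the axes as separate arguments moves the channel axis to the front, so the result can be unpacked into three height-by-width planes. The tiny constant-colour image is made-up sample data.

```python
import numpy as np

# Height 2, width 3, channels R, G, B.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[..., 0], rgb[..., 1], rgb[..., 2] = 10, 20, 30

# (height, width, channel) -> (channel, height, width), then unpack.
r, g, b = rgb.astype(np.float32).transpose(2, 0, 1)

print(r.shape, g.shape, b.shape)
```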

WeatherGod · 2015-07-22T15:59:50Z

lib/matplotlib/backends/backend_pdf.py

+        written = 0
+        header = bytearray(8)
+        while True:
+            n = buffer.readinto(header)


This line is now causing failures in the Python 2.6 and Python 2.7 image tests on master. The error message says that the buffer does not have a "readinto()" method.

That is because the merge of the six branch (#4501) changed it from

from io import BytesIO

to

from matplotlib.externals.six import BytesIO

six.BytesIO is StringIO.StringIO in Python 2, which has no readinto() method

while #4603 changed it to use BytesIO all over.

I.e., I think the merge in 8fe495a is wrong.

PR coming up
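For context, a self-contained sketch of the kind of chunk-reading loop that depends on readinto(). This is not the code from the PR: `make_chunk` and `collect_idat` are hypothetical names, and the tiny png is assembled by hand so the example runs without libpng. It works with io.BytesIO, which provides readinto() on both Python 2 and 3.

```python
import io
import struct
import zlib


def make_chunk(kind, payload):
    """Serialize one PNG chunk: length, type, payload, CRC."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return (struct.pack('!I', len(payload)) + kind + payload
            + struct.pack('!I', crc))


def collect_idat(buffer):
    """Concatenate the payloads of all IDAT chunks in a PNG stream."""
    buffer.seek(8)  # skip the 8-byte PNG signature
    header = bytearray(8)
    parts = []
    while True:
        n = buffer.readinto(header)  # fills the 4-byte length and 4-byte type
        if n < 8:
            raise ValueError("truncated PNG stream")
        length, kind = struct.unpack('!L4s', bytes(header))
        if kind == b'IDAT':
            parts.append(buffer.read(length))
        elif kind == b'IEND':
            break
        else:
            buffer.seek(length, io.SEEK_CUR)  # skip other chunks' payloads
        buffer.seek(4, io.SEEK_CUR)  # skip the CRC
    return b''.join(parts)


# One 4-pixel grayscale row, prefixed with filter tag 0 (None).
filtered = b'\x00' + bytes(bytearray(range(4)))
compressed = zlib.compress(filtered)
ihdr = struct.pack('!IIBBBBB', 4, 1, 8, 0, 0, 0, 0)
png = (b'\x89PNG\r\n\x1a\n'
       + make_chunk(b'IHDR', ihdr)
       + make_chunk(b'IDAT', compressed[:5])   # split to exercise the loop
       + make_chunk(b'IDAT', compressed[5:])
       + make_chunk(b'IEND', b''))

stream = collect_idat(io.BytesIO(png))
```

The concatenated IDAT payloads form the zlib stream of predictor-filtered rows, which is the part a PDF image stream can reuse.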

╯‵Д′)╯彡┻━┻

jkseppan added 2 commits July 8, 2015 18:11

Compress raster images in PDF output using libpng

60161fa

The PDF format allows for PNG predictors in Flate-compressed streams, so we can use libpng to encode a png file and extract the row data from that.

tacaswell added the status: needs review label Jul 8, 2015

WeatherGod reviewed Jul 8, 2015

jkseppan added 2 commits July 8, 2015 23:16

Fix typo

8f6706d

Use array(..., order='C') instead of ascontiguousarray

59deb8c

Not sure when ascontiguousarray was introduced, order='C' should work in any version of numpy. Combine the very similar _gray and _rgb methods into one, and add some docstrings.

jkseppan reviewed Jul 9, 2015

jkseppan added 2 commits July 9, 2015 10:05

Use explicit bytestring with struct.unpack

5085af7

Apparently needed in Python 2.6.

More Python 2.6 compatibility

9c2cf4c

WeatherGod reviewed Jul 9, 2015

jkseppan added 2 commits July 9, 2015 17:26

Improve docstrings

3cd073f

Delete an extra word in an error message

18dcc54

tacaswell modified the milestone: next point release Jul 17, 2015

tacaswell added a commit that referenced this pull request Jul 22, 2015

Merge pull request #4605 from jkseppan/png-in-pdf

336c1bb

ENH: Use png predictors when compressing images in pdf files

tacaswell merged commit 336c1bb into matplotlib:master Jul 22, 2015

tacaswell removed the status: needs review label Jul 22, 2015

WeatherGod reviewed Jul 22, 2015

jenshnielsen mentioned this pull request Jul 22, 2015

Use BytesIO from io. #4757

Merged

jkseppan deleted the png-in-pdf branch July 22, 2015 17:51
