
PDF file generation is not deterministic - results in different outputs on the same input #6317

Closed
@drufat

Suppose you want to generate a PDF file with Matplotlib and save it.

genfigure.py:

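import sys

import matplotlib.pyplot as plt

# Draw a single line and save it to the path given on the command line.
# savefig infers the output format from the file extension.
plt.plot([0, 1], [0, 1])
plt.savefig(sys.argv[1])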

Run the script twice from the command line:

$ python genfigure.py 1.pdf
$ python genfigure.py 2.pdf

Since both runs save the same figure, I would expect the two output files to be identical. However, their hashes differ. In my particular case:

$ md5sum 1.pdf 2.pdf
e54cdbd65a6baaa5152d90743d800039  1.pdf
4b0ac2a7c046c4813114c63f3c4d27e7  2.pdf
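
The same check can be scripted, which is convenient when md5sum is not available or when the comparison should run as part of a test. The following is a minimal sketch using only the standard library; the helper names `file_md5` and `same_content` are hypothetical and not part of Matplotlib.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files have the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

Calling `same_content("1.pdf", "2.pdf")` after the two runs above would be expected to return False for the files described here.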

By contrast, no such issue occurs when saving PNG files:

$ python genfigure.py 1.png
$ python genfigure.py 2.png

The two PNG files have identical hashes:

$ md5sum 1.png 2.png
5d22187827337cd9262ee248550fab6f  1.png
5d22187827337cd9262ee248550fab6f  2.png

It appears that the PDF backend has some source of non-determinism that the PNG backend does not.
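
To narrow down where the non-determinism comes from, it can help to find the byte offsets at which the two files differ and inspect what is stored there. The sketch below is one way to do that with the standard library; `differing_ranges` is a hypothetical helper name. A plausible candidate, stated here as an assumption rather than a confirmed cause, is a timestamp such as the /CreationDate entry that a PDF can carry in its document information dictionary.

```python
def differing_ranges(data_a: bytes, data_b: bytes):
    """Return half-open (start, end) offset ranges where the inputs differ.

    Bytes are compared position by position over the common length.
    If the lengths differ, the extra tail is reported as a final range.
    """
    ranges = []
    start = None
    common = min(len(data_a), len(data_b))
    for i in range(common):
        if data_a[i] != data_b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(data_a) != len(data_b):
        ranges.append((common, max(len(data_a), len(data_b))))
    return ranges
```

Reading 1.pdf and 2.pdf in binary mode, passing their contents to `differing_ranges`, and printing a few dozen bytes around each reported range should show which PDF objects change between runs. This positional comparison is only informative when the two files have the same layout; if content shifts by even one byte, everything after that point is reported as different.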

Is there a way to ensure that saving the same figure multiple times results in exactly the same PDF file?
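
One possible approach is sketched below. It assumes a Matplotlib version whose PDF backend accepts a `metadata` argument to `savefig` and drops any key set to None, and it assumes that the creation timestamp is the only field that varies between runs; neither assumption is established by the report above. The function name `render_pdf_bytes` is hypothetical.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def render_pdf_bytes():
    """Render the example figure to PDF and return the raw bytes."""
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1])
    buf = io.BytesIO()
    # Assumption: CreationDate is the run-dependent field. Setting the key
    # to None asks the PDF backend to leave it out of the file.
    fig.savefig(buf, format="pdf", metadata={"CreationDate": None})
    plt.close(fig)
    return buf.getvalue()
```

If the assumptions hold, two calls to `render_pdf_bytes()` should return identical bytes. For builds where the script itself cannot be changed, Matplotlib's documentation also describes honoring the SOURCE_DATE_EPOCH environment variable, which pins the timestamp to a fixed value instead of removing it.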
