Description
Suppose you want to generate a pdf file with matplotlib and save it.
genfigure.py:
import matplotlib.pyplot as plt
import sys
plt.plot([0, 1], [0, 1])
plt.savefig(sys.argv[1])
Run the script twice from the command line:
$ python genfigure.py 1.pdf
$ python genfigure.py 2.pdf
Given that we are saving the same figure, we would expect the two outputs to be identical. However, the file hashes differ. In my particular case:
$ md5sum 1.pdf 2.pdf
e54cdbd65a6baaa5152d90743d800039 1.pdf
4b0ac2a7c046c4813114c63f3c4d27e7 2.pdf
On the other hand, no such issue exists when saving png files.
$ python genfigure.py 1.png
$ python genfigure.py 2.png
The two png files are exactly the same:
$ md5sum 1.png 2.png
5d22187827337cd9262ee248550fab6f 1.png
5d22187827337cd9262ee248550fab6f 2.png
It appears that pdf saving has some source of non-determinism.
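To narrow down where the two pdf files diverge, one option is to compare them byte by byte rather than only by hash. The following is a minimal sketch using only the standard library; `md5_of` and `first_difference` are hypothetical helper names introduced here for illustration and are not part of matplotlib.

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's contents (same value md5sum reports)."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def first_difference(path_a, path_b, context=40):
    """Locate the first byte at which two files differ.

    Returns None if the files are identical, otherwise a tuple
    (offset, snippet_a, snippet_b) where the snippets are the raw bytes
    around the first differing offset in each file.
    """
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    if a == b:
        return None
    # Walk the common prefix; if one file is a strict prefix of the other,
    # the difference starts at the length of the shorter file.
    limit = min(len(a), len(b))
    offset = next((i for i in range(limit) if a[i] != b[i]), limit)
    start = max(0, offset - context)
    end = offset + context
    return offset, a[start:end], b[start:end]


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        file_a, file_b = sys.argv[1], sys.argv[2]
        print(md5_of(file_a), file_a)
        print(md5_of(file_b), file_b)
        print(first_difference(file_a, file_b))
```

Saved as, say, `diffbytes.py` (a hypothetical filename), it could be run as `python diffbytes.py 1.pdf 2.pdf`. The returned snippets show the bytes surrounding the first mismatch, which may hint at which part of the file varies between runs.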
Is there a way to ensure that saving the same figure multiple times results in exactly the same pdf file?
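One possible approach is sketched below. It assumes that the varying bytes come from the creation timestamp that the pdf backend writes into the document metadata, and that the installed matplotlib version accepts a `metadata` argument for pdf output in which a value of `None` removes a predefined key. Both are assumptions to verify against your own setup. The function name `render_pdf_bytes` is hypothetical.

```python
import hashlib
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def render_pdf_bytes():
    """Render the same line plot as genfigure.py and return the pdf as bytes."""
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1])
    buf = io.BytesIO()
    # Assumption: the timestamp is the only field that changes between runs.
    # Setting 'CreationDate' to None asks the pdf backend to omit it.
    fig.savefig(buf, format="pdf", metadata={"CreationDate": None})
    plt.close(fig)
    return buf.getvalue()


if __name__ == "__main__":
    first = hashlib.md5(render_pdf_bytes()).hexdigest()
    second = hashlib.md5(render_pdf_bytes()).hexdigest()
    print(first)
    print(second)
    print("identical" if first == second else "different")
```

Another option that may work without changing the script is to set the `SOURCE_DATE_EPOCH` environment variable to a fixed value before running it, which matplotlib is documented to use in place of the current time. If the hashes still differ after either change, some other part of the file is varying and the timestamp assumption does not hold.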