I am generating and saving SVG images using matplotlib and would like to make them as reproducible as possible. However, even after setting np.random.seed and random.seed, the various id and xlink:href values in the SVG images still change between runs of my code.
I assume these differences are due to the backend that matplotlib uses to render SVG images. Is there any way to set the seed for this backend such that identical plots produce identical output between two different runs of the code?
Sample code (run this twice, changing the name in plt.savefig for the second run):
import random
import numpy as np
import matplotlib.pyplot as plt
random.seed(42)
np.random.seed(42)
x, y = np.random.randn(4096), np.random.randn(4096)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=(64,64))
fig, axis = plt.subplots()
plt.savefig("random_1.svg")
Compare files:
diff random_1.svg random_2.svg | head
35c35
< " id="md3b71b67b7" style="stroke:#000000;stroke-width:0.8;"/>
---
> " id="m7ee1b067d8" style="stroke:#000000;stroke-width:0.8;"/>
38c38
< <use style="stroke:#000000;stroke-width:0.8;" x="57.6" xlink:href="#md3b71b67b7" y="307.584"/>
---
> <use style="stroke:#000000;stroke-width:0.8;" x="57.6" xlink:href="#m7ee1b067d8" y="307.584"/>
82c82
< <use style="stroke:#000000;stroke-width:0.8;" x="129.024" xlink:href="#md3b71b67b7" y="307.584"/>
There is an option svg.hashsalt in matplotlib's rcParams which seems to be used exactly for that purpose:
# svg backend params
#svg.image_inline : True # write raster image data directly into the svg file
#svg.fonttype : 'path' # How to handle SVG fonts:
# 'none': Assume fonts are installed on the machine where the SVG will be viewed.
# 'path': Embed characters as paths -- supported by most SVG renderers
# 'svgfont': Embed characters as SVG fonts -- supported only by Chrome, Opera and Safari
svg.hashsalt : None # if not None, use this string as hash salt instead of uuid4
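The "instead of uuid4" comment hints at why seeding random and NumPy had no effect: uuid.uuid4() in the Python standard library draws from the operating system's entropy source, not from the seeded generators. The following standalone check is only a sketch of that behaviour and does not touch Matplotlib; the helper name seeded_uuid is made up for illustration.

```python
import random
import uuid


def seeded_uuid(seed):
    """Seed Python's random module, then ask for a fresh UUID4."""
    random.seed(seed)
    return uuid.uuid4()


first = seeded_uuid(42)
second = seeded_uuid(42)

# The two values differ even though the seed was identical,
# because uuid4 does not consume the seeded generator.
print(first)
print(second)
print(first == second)
```

A fixed salt replaces this per-run randomness, which is why the ids become stable once svg.hashsalt is set.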
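The following code produces two exactly identical files, down to the XML ids:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# A fixed salt makes the generated ids deterministic.
# Use a string: newer Matplotlib versions may reject a non-string value.
mpl.rcParams['svg.hashsalt'] = '42'
np.random.seed(42)
x, y = np.random.randn(4096), np.random.randn(4096)
fig, ax = plt.subplots()
ax.hist(x)
# Save the same figure twice so the two files can be compared.
for i in [1, 2]:
    plt.savefig("random_{}.svg".format(i))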
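To check the result without writing files and running diff by hand, the comparison can be done in memory. This is a minimal sketch, not part of the original answer: the helper render_svg is hypothetical, and it assumes a reasonably recent Matplotlib. It also passes metadata={"Date": None} to savefig, because newer versions embed a creation timestamp in the SVG metadata that would otherwise differ between saves; that argument is an addition of this sketch.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np


def render_svg(salt):
    """Render a fixed histogram to SVG and return the raw bytes."""
    rng = np.random.RandomState(42)  # local generator, same data every call
    x = rng.randn(4096)
    # rc_context restores the previous rcParams when the block exits.
    with mpl.rc_context({"svg.hashsalt": salt}):
        fig, ax = plt.subplots()
        ax.hist(x)
        buf = io.BytesIO()
        # Date=None drops the timestamp from the SVG metadata (assumption:
        # supported by the installed Matplotlib version).
        fig.savefig(buf, format="svg", metadata={"Date": None})
        plt.close(fig)
    return buf.getvalue()


first = render_svg("42")
second = render_svg("42")
print(first == second)
```

If the two byte strings compare equal, the salt and metadata settings are enough for this figure. If they do not, diffing the two outputs shows which element still varies.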