Set random seed for matplotlib plotting backend

I am generating and saving SVG images using matplotlib and would like to make them as reproducible as possible. However, even after setting np.random.seed and random.seed, the various id and xlink:href values in the SVG images still change between runs of my code.

I assume these differences are due to the backend that matplotlib uses to render SVG images. Is there any way to set the seed for this backend such that identical plots produce identical output between two different runs of the code?

Sample code (run this twice, changing the name in plt.savefig for the second run):

import random
import numpy as np
import matplotlib.pyplot as plt

random.seed(42)
np.random.seed(42)

x, y = np.random.randn(4096), np.random.randn(4096)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=(64,64))

fig, axis = plt.subplots()
plt.savefig("random_1.svg")

Compare files:

diff random_1.svg random_2.svg | head
35c35
< " id="md3b71b67b7" style="stroke:#000000;stroke-width:0.8;"/>
---
> " id="m7ee1b067d8" style="stroke:#000000;stroke-width:0.8;"/>
38c38
<        <use style="stroke:#000000;stroke-width:0.8;" x="57.6" xlink:href="#md3b71b67b7" y="307.584"/>
---
>        <use style="stroke:#000000;stroke-width:0.8;" x="57.6" xlink:href="#m7ee1b067d8" y="307.584"/>
82c82
<        <use style="stroke:#000000;stroke-width:0.8;" x="129.024" xlink:href="#md3b71b67b7" y="307.584"/>

saladi asked Jan 05 '18 05:01


1 Answer

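There is an option, svg.hashsalt, in matplotlib's rcParams that appears to exist for exactly this purpose. When it is set to a string, that string is used as the hash salt for the generated ids instead of a random uuid4: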

# svg backend params
#svg.image_inline : True       # write raster image data directly into the svg file
#svg.fonttype : 'path'         # How to handle SVG fonts:
#    'none': Assume fonts are installed on the machine where the SVG will be viewed.
#    'path': Embed characters as paths -- supported by most SVG renderers
#    'svgfont': Embed characters as SVG fonts -- supported only by Chrome,
#               Opera and Safari
svg.hashsalt : None           # if not None, use this string as hash salt
                              # instead of uuid4
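
To see why a fixed salt would make the ids stable, here is a minimal stdlib-only sketch of a salted id generator. It assumes, as the comment above suggests, that each id is a hash of the salt plus the element's content. The names make_id and marker_path are hypothetical and only for illustration; this is not matplotlib's API.

```python
import hashlib
import uuid


def make_id(prefix, content, salt=None):
    """Hypothetical stand-in for a salted element-id generator."""
    if salt is None:
        # No salt configured: fall back to a fresh random UUID, so the
        # resulting id changes on every call (and therefore on every run).
        salt = str(uuid.uuid4())
    digest = hashlib.sha256()
    digest.update(salt.encode("utf8"))
    digest.update(str(content).encode("utf8"))
    # Keep only a short prefix of the hex digest, similar in shape to
    # the ids seen in the diff (e.g. "md3b71b67b7").
    return prefix + digest.hexdigest()[:10]


marker_path = "M 0 0 L 0 3.5"  # made-up path data for the example

random_a = make_id("m", marker_path)
random_b = make_id("m", marker_path)
fixed_a = make_id("m", marker_path, salt="42")
fixed_b = make_id("m", marker_path, salt="42")

print(random_a == random_b)  # random salt: the two ids differ in practice
print(fixed_a == fixed_b)    # fixed salt: same content gives the same id
```

With a random salt the same content maps to a different id each time, which matches the changing id and xlink:href values in the question; with a fixed salt the id depends only on the content.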

The following code produces two identical files, down to the XML ids:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# Fixed salt for the SVG id hashes; the value should be a string
mpl.rcParams['svg.hashsalt'] = '42'
np.random.seed(42)

x, y = np.random.randn(4096), np.random.randn(4096)

fig, ax = plt.subplots()
ax.hist(x)

# Save the same figure twice; the two files should be identical
for i in [1, 2]:
    plt.savefig("random_{}.svg".format(i))
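
To confirm the result without reading a diff, the two files can be compared byte for byte from Python. This is a small sketch using only the standard library; files_identical is a hypothetical helper name, and the file names in the usage note are the ones written by the code above.

```python
import filecmp


def files_identical(path_a, path_b):
    """Return True if both files have exactly the same bytes."""
    # shallow=False forces a content comparison instead of relying on
    # os.stat() signatures (size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Calling files_identical("random_1.svg", "random_2.svg") should return True once the salt is fixed. If the files still differ on a recent matplotlib version, one likely remaining difference is the creation date written into the SVG metadata, which is separate from the id salt.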

Diziet Asahi answered Oct 24 '22 14:10