
Matplotlib multiple scatter subplots - reduce svg file size

I generated a plot in Matplotlib which consists of 50 subplots. In each of these subplots I have a scatterplot with about 3000 datapoints. I'm doing this, because I just want to have an overview of the different scatter plots in a document I'm working on.

This also works so far and looks nice, but the problem is obviously that the SVG file that I'm getting is really big (about 15 MB). And Word just can't handle such a big SVG file.

So my question: is there a way to optimize this SVG file? Many of the datapoints in my scatter plots overlap each other, so I guess it should be possible to remove many of the "invisible" ones without changing the visible output. (Something like this in Illustrator seems to be what I want to do: Link) Is it also possible to do something like this in Inkscape? Or even directly in Matplotlib?

I know that I can just produce a PNG file, but I would prefer to have the plot as a vector graphic in my document.

Frank asked Dec 11 '22 06:12


1 Answer

If you want to keep all the data points as vector graphics, it's unlikely you'll be able to reduce the file size.

While not ideal, one potential option is to rasterize only the data points created by ax.scatter, and leave the axes, labels, titles, etc. all as vector elements on your figure. This can dramatically reduce the file size, and if you set the dpi high enough, you probably won't lose any useful information from the plot.

You can do this by setting rasterized=True when calling ax.scatter.

You can then control the dpi of the rasterized elements by passing dpi=300 (or whatever dpi you want) when you call fig.savefig.

Consider the following:

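import matplotlib.pyplot as plt

# Two identical 10x5 grids: one fully vector (V), one with rasterized points (R)
figV, axesV = plt.subplots(nrows=10, ncols=5)
figR, axesR = plt.subplots(nrows=10, ncols=5)

# Vector version: every point is written to the SVG as its own element
for ax in figV.axes:
    ax.scatter(range(3000), range(3000))
# Rasterized version: only the points become a bitmap; axes and labels stay vector
for ax in figR.axes:
    ax.scatter(range(3000), range(3000), rasterized=True)

figV.savefig('bigscatterV.svg')
# dpi sets the resolution of the rasterized points
figR.savefig('bigscatterR.svg', dpi=300)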

bigscatterV.svg has a file size of 16 MB, while bigscatterR.svg has a file size of only 250 KB.
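If you want to check the saving on your own data before committing to a dpi, here is a minimal sketch that writes both variants to an in-memory buffer and compares their sizes. The helper name svg_size_bytes and the smaller 2x2 grid are my own choices for illustration (not part of Matplotlib), and the exact numbers will depend on your data, figure size and Matplotlib version.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def svg_size_bytes(rasterized, n_points=3000, nrows=2, ncols=2, dpi=300):
    """Hypothetical helper: size in bytes of the SVG produced for a grid of scatter plots."""
    fig, axes = plt.subplots(nrows=nrows, ncols=ncols)
    for ax in fig.axes:
        # Same toy data as above; replace with your own x/y arrays
        ax.scatter(range(n_points), range(n_points), rasterized=rasterized)
    buf = io.BytesIO()
    # dpi only affects the rasterized artists in an SVG
    fig.savefig(buf, format="svg", dpi=dpi)
    plt.close(fig)  # free the figure so repeated calls don't accumulate memory
    return buf.getbuffer().nbytes


vector_size = svg_size_bytes(rasterized=False)
raster_size = svg_size_bytes(rasterized=True)
print(f"vector: {vector_size / 1e6:.2f} MB, rasterized: {raster_size / 1e6:.2f} MB")
```

Trying a few dpi values this way should show where the file gets small enough for Word while the points still look sharp at your print size.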

tmdavison answered Jan 21 '23 09:01