
matplotlib bitmap plot with vector text

I'm plotting a waveform (among other things), and the resulting vector file (PDF) is bigger than the corresponding raster file (PNG). I imagine this is because the plotted dataset is very large, so the vector file contains millions of drawing instructions. Besides being bigger, the PDF is also hard for PDF readers to display: some take a few seconds to load it, and others don't load it at all.

In pyplot, is it possible to have a bitmap plot with vector axes, labels and all other text?

My (very bad) solution at the moment is to generate the PDF, generate the PNG, open the PDF with Inkscape and replace the plot with the PNG one. Obviously this is too manual, and very time-consuming if you realise you have to regenerate the plot.

gozzilli asked Jul 18 '13 17:07


2 Answers

It should be as simple as passing in rasterized=True to the plot command.

E.g.

import matplotlib.pyplot as plt

plt.plot(range(10), rasterized=True)
plt.savefig('test.pdf')

For me, this results in a PDF with a rasterized line and vector text. The resolution of the rasterized part is controlled by the dpi you pass to savefig (by default, it's 100).
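As a slightly fuller illustration of the same idea, here is a minimal sketch that rasterizes only the data line of a large waveform and sets the resolution explicitly through the dpi argument of savefig. The synthetic signal, the sample count, the output file name and the Agg backend are all assumptions made for the example, not part of the original question.

import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for a large waveform (assumption: 200,000 noisy samples).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200_000)
signal = np.sin(2 * np.pi * 5 * t) + 0.3 * rng.standard_normal(t.size)

fig, ax = plt.subplots()
# Only this line is flagged for rasterization; axes, ticks and text stay vector.
(line,) = ax.plot(t, signal, linewidth=0.5, rasterized=True)
ax.set_xlabel("time (s)")
ax.set_ylabel("amplitude")
ax.set_title("waveform")

# dpi sets the resolution of the rasterized part inside the PDF.
out_path = os.path.join(tempfile.gettempdir(), "waveform_rasterized.pdf")
fig.savefig(out_path, dpi=300)
plt.close(fig)

A higher dpi gives a sharper bitmap at the cost of a larger file, so it is probably worth trying a couple of values against the target page size.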

Joe Kington answered Oct 02 '22 11:10


I use a dirty "fix" for this problem: I simply produce the plot twice. In one pass I remove all the frames, titles, etc. and save the result as a PNG; in the other I remove the actual data and save all the components that I want as vector data in a PDF. Then I use ImageMagick to convert the PNG into a PDF containing bitmap data, and overlay the vector data from the other PDF using pdftk. Here is a pcolor example from the matplotlib page, adapted in the way I just described.

import matplotlib.pyplot as plt
import numpy as np
import os

for case in ['frame', 'data']:

    # make these smaller to increase the resolution
    dx, dy = 0.02, 0.02

    # generate 2 2d grids for the x & y bounds
    y, x = np.mgrid[slice(-3, 3 + dy, dy),
                    slice(-3, 3 + dx, dx)]
    z = (1 - x / 2. + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
    # x and y are bounds, so z should be the value *inside* those bounds.
    # Therefore, remove the last value from the z array.
    z = z[:-1, :-1]
    z_min, z_max = -np.abs(z).max(), np.abs(z).max()

    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    im = plt.pcolor(x, y, z, cmap='RdBu', vmin=z_min, vmax=z_max)
    plt.title('pcolor')
    # set the limits of the plot to the limits of the data
    plt.axis([x.min(), x.max(), y.min(), y.max()])

    # compare strings with ==, not `is` (identity is not guaranteed for equal strings)
    if case == 'frame':
        # keep axes, ticks and title; drop the data itself
        im.remove()
        plt.savefig("frame.pdf", transparent=True)
    if case == 'data':
        # keep the data; hide the axes frame and the title
        ax.axison = False
        plt.title('')
        plt.savefig("data.png", transparent=True)



os.system('convert data.png data.pdf')
os.system('pdftk frame.pdf background data.pdf output final_plot.pdf')
os.system('rm data.png data.pdf frame.pdf')

Basically it is just an automated version of what you are already doing...
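If the external steps fail silently under os.system, one option is to run the same two commands through subprocess so that a missing tool or a non-zero exit status raises an error. This is only a sketch: build_overlay_commands and run_overlay are hypothetical helper names introduced here, and it assumes the convert (ImageMagick) and pdftk executables are on the PATH.

import shutil
import subprocess


def build_overlay_commands(data_png, frame_pdf, output_pdf, data_pdf="data.pdf"):
    """Return the two external commands as argument lists (hypothetical helper)."""
    return [
        # ImageMagick: wrap the bitmap in a single-page PDF
        ["convert", data_png, data_pdf],
        # pdftk: put the bitmap PDF behind the vector frame PDF
        ["pdftk", frame_pdf, "background", data_pdf, "output", output_pdf],
    ]


def run_overlay(data_png, frame_pdf, output_pdf):
    """Run both commands, failing loudly if a tool is missing or a step fails."""
    commands = build_overlay_commands(data_png, frame_pdf, output_pdf)
    missing = [cmd[0] for cmd in commands if shutil.which(cmd[0]) is None]
    if missing:
        raise RuntimeError("missing tools: " + ", ".join(missing))
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


# Building the commands has no side effects; run_overlay would execute them.
commands = build_overlay_commands("data.png", "frame.pdf", "final_plot.pdf")

Calling run_overlay("data.png", "frame.pdf", "final_plot.pdf") after the two savefig calls would then take the place of the first two os.system lines; the cleanup of the intermediate files is left out of this sketch.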

Andreas Bleuler answered Oct 02 '22 10:10