
Matplotlib Agg Rendering Complexity Error

I am trying to produce a 600 dpi graph using Python and matplotlib. However, Python plotted only 2 of the 8 graphs and then raised this error:

OverflowError: Agg rendering complexity exceeded. Consider downsampling or decimating your data.

I am plotting a huge amount of data (7,500,000 data points per column), so I guess either this is some kind of overloading problem or I need to set a larger cell_block_limit.

I searched Google for a way to change cell_block_limit, but to no avail. What would be a good approach?

The code is as follows:

        import matplotlib.pyplot as plt
        from matplotlib.ticker import MultipleLocator, FormatStrFormatter

        majorLocator   = MultipleLocator(200)
        majorFormatter = FormatStrFormatter('%d')
        minorLocator   = MultipleLocator(20)

        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.xaxis.set_major_locator(majorLocator)
        ax.xaxis.set_major_formatter(majorFormatter)
        ax.xaxis.set_minor_locator(minorLocator)
        ax.xaxis.set_ticks_position('bottom')
        ax.xaxis.grid(True,which='minor')
        ax.yaxis.grid(True)
        plt.plot(timemat,fildata)
        plt.xlabel(plotxlabel,fontsize=14)
        plt.ylabel(plotylabel,fontsize=14)      
        plt.title(plottitle,fontsize=16)
        fig.savefig(plotsavetitle,dpi=600)
Harry MacDowel asked Jan 16 '12 05:01



2 Answers

In addition to @Lennart's point that there's no need for the full resolution, you might also consider a plot similar to the following.

Calculating the max/mean/min of a "chunked" version is very simple and efficient if you use a 2D view of the original array and the axis keyword arg to x.min(), x.max(), etc.
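
As a toy illustration of that reshape-and-reduce trick, here is a minimal sketch. The array `y` and the chunk size of 3 are made up for the example and are not taken from the question.

```python
import numpy as np

# Small stand-in for the real data; values chosen so results are easy to check.
y = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3])
chunksize = 3

# Drop the trailing partial chunk, then view the rest as rows of `chunksize`.
numchunks = y.size // chunksize
ychunks = y[:chunksize * numchunks].reshape((-1, chunksize))

# Reducing along axis=1 gives one value per chunk.
chunk_min = ychunks.min(axis=1)
chunk_max = ychunks.max(axis=1)
chunk_mean = ychunks.mean(axis=1)

# The 2D array is a view onto the original buffer, not a copy.
is_view = np.shares_memory(y, ychunks)

print(chunk_min, chunk_max, chunk_mean, is_view)
```

Each row of `ychunks` is one chunk, so ten samples with a chunk size of 3 reduce to three values per statistic, and the last leftover sample is dropped.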

Even with the filtering, plotting this is much faster than plotting the full array.

(Note: to plot this many points, you'll have to tune down the noise level a bit. Otherwise you'll get the OverflowError you mentioned. If you want to compare against plotting the "full" dataset, change the 0.3 in the y += 0.3 * y.max() * np.random... line to something more like 0.1, or remove the line completely.)

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(1977)

# Generate some very noisy but interesting data...
num = int(1e7)  # must be an int; recent NumPy versions reject a float count
x = np.linspace(0, 10, num)
y = np.random.random(num) - 0.5
y.cumsum(out=y) 
y += 0.3 * y.max() * np.random.random(num)

fig, ax = plt.subplots()

# Wrap the array into a 2D array of chunks, truncating the last chunk if 
# chunksize isn't an even divisor of the total size.
# (This part won't use _any_ additional memory)
chunksize = 10000
numchunks = y.size // chunksize 
ychunks = y[:chunksize*numchunks].reshape((-1, chunksize))
xchunks = x[:chunksize*numchunks].reshape((-1, chunksize))

# Calculate the max, min, and means of chunksize-element chunks...
max_env = ychunks.max(axis=1)
min_env = ychunks.min(axis=1)
ycenters = ychunks.mean(axis=1)
xcenters = xchunks.mean(axis=1)

# Now plot the bounds and the mean...
ax.fill_between(xcenters, min_env, max_env, color='gray', 
                edgecolor='none', alpha=0.5)
ax.plot(xcenters, ycenters)

fig.savefig('temp.png', dpi=600)

Joe Kington answered Nov 11 '22 01:11


At 600 dpi you would have to make the plot roughly 320 meters wide (7,500,000 points / 600 dots per inch = 12,500 inches) to plot that data without decimating it. :-)

I would suggest chunking the data into pieces a couple of hundred or maybe even a thousand samples long, and extracting the maximum value from each piece.

Something like this:

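def chunkmax(data, chunk_size):
    """Yield the maximum of each consecutive chunk_size-long chunk of data."""
    chunk = []
    for value in data:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield max(chunk)
            chunk = []  # start a fresh chunk
    if chunk:
        # Yield the max of the trailing partial chunk, if there is one.
        yield max(chunk)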

With a chunk_size of 1000, this would give you 7500 points to plot, and you can then easily see where in the data the shock comes. (Unless the data is so noisy that you would have to average it to see whether there is a shock or not. But that's also easily fixable.)
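
To actually plot the decimated values you also need a matching x value for each chunk. Here is a rough NumPy sketch of the same idea, assuming the data are 1-D arrays of equal length. The name `decimate_max` and the synthetic `t` and `signal` arrays are hypothetical, for illustration only, and are not part of the question's code.

```python
import numpy as np

def decimate_max(x, y, chunk_size):
    """Return (x at each chunk's maximum, chunk maximum).

    Any trailing partial chunk is dropped.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    numchunks = y.size // chunk_size
    ychunks = y[:numchunks * chunk_size].reshape(numchunks, chunk_size)
    xchunks = x[:numchunks * chunk_size].reshape(numchunks, chunk_size)
    # Position of the maximum inside each chunk, used to pick the matching x.
    idx = ychunks.argmax(axis=1)
    rows = np.arange(numchunks)
    return xchunks[rows, idx], ychunks[rows, idx]

# Synthetic stand-in data: a flat signal with a single spike at sample 4321.
t = np.arange(10_000, dtype=float)
signal = np.zeros(10_000)
signal[4321] = 7.0

t_dec, s_dec = decimate_max(t, signal, 1000)
print(t_dec.size, s_dec.max(), t_dec[s_dec.argmax()])
```

Plotting `t_dec` against `s_dec` keeps the spike at its original time while drawing 1,000 times fewer points.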

Lennart Regebro answered Nov 11 '22 00:11