I am trying to print a 600 dpi graph using Python and matplotlib. However, Python plotted only 2 out of 8 graphs and raised the error:
OverflowError: Agg rendering complexity exceeded. Consider downsampling or decimating your data.
I am plotting a huge amount of data (7,500,000 data points per column), so I guess this is either an overload problem or I need to set a larger cell_block_limit.
I tried searching Google for ways to change the cell_block_limit, but to no avail. What would be a good approach?
The code is as follows:
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, FormatStrFormatter
majorLocator = MultipleLocator(200)
majorFormatter = FormatStrFormatter('%d')
minorLocator = MultipleLocator(20)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.xaxis.set_major_locator(majorLocator)
ax.xaxis.set_major_formatter(majorFormatter)
ax.xaxis.set_minor_locator(minorLocator)
ax.xaxis.set_ticks_position('bottom')
ax.xaxis.grid(True,which='minor')
ax.yaxis.grid(True)
plt.plot(timemat,fildata)
plt.xlabel(plotxlabel,fontsize=14)
plt.ylabel(plotylabel,fontsize=14)
plt.title(plottitle,fontsize=16)
fig.savefig(plotsavetitle,dpi=600)
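One knob worth trying before downsampling: Agg's path-chunking limit can be raised through rcParams. Setting `agg.path.chunksize` to a nonzero value tells the Agg backend to split very long paths into smaller pieces while rendering, which is the documented workaround for this complexity error on a single long line. A minimal sketch (the data here is synthetic, and smaller than the 7.5M points in the question):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend that writes straight to file
import matplotlib.pyplot as plt
import numpy as np

# The default of 0 means "no chunking", which is what can trigger the
# "Agg rendering complexity exceeded" error on huge line plots.
plt.rcParams['agg.path.chunksize'] = 10000

n = 1_000_000  # synthetic stand-in for the question's 7.5M-point columns
x = np.linspace(0, 10, n)
y = np.cumsum(np.random.randn(n))

fig, ax = plt.subplots()
ax.plot(x, y)
fig.savefig('big_plot.png', dpi=150)  # the question uses dpi=600; same idea
```

This only helps when the figure's complexity comes from long connected lines; if it still overflows, decimating the data as the answers below suggest is the more robust fix.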
In addition to @Lennart's point that there's no need for the full resolution, you might also consider a plot similar to the following.
Calculating the max/mean/min of a "chunked" version is very simple and efficient if you use a 2D view of the original array and the axis keyword arg to x.min(), x.max(), etc.
Even with the filtering, plotting this is much faster than plotting the full array.
(Note: to plot this many points, you'll have to tune down the noise level a bit. Otherwise you'll get the OverflowError you mentioned. If you want to compare plotting the "full" dataset, change the y += 0.3 * y.max() * np.random... line to use a factor more like 0.1, or remove it completely.)
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1977)

# Generate some very noisy but interesting data...
num = 10_000_000  # must be an int for linspace/random
x = np.linspace(0, 10, num)
y = np.random.random(num) - 0.5
y.cumsum(out=y)
y += 0.3 * y.max() * np.random.random(num)

fig, ax = plt.subplots()

# Wrap the array into a 2D array of chunks, truncating the last chunk if
# chunksize isn't an even divisor of the total size.
# (This part won't use _any_ additional memory)
chunksize = 10000
numchunks = y.size // chunksize
ychunks = y[:chunksize*numchunks].reshape((-1, chunksize))
xchunks = x[:chunksize*numchunks].reshape((-1, chunksize))

# Calculate the max, min, and means of chunksize-element chunks...
max_env = ychunks.max(axis=1)
min_env = ychunks.min(axis=1)
ycenters = ychunks.mean(axis=1)
xcenters = xchunks.mean(axis=1)

# Now plot the bounds and the mean...
ax.fill_between(xcenters, min_env, max_env, color='gray',
                edgecolor='none', alpha=0.5)
ax.plot(xcenters, ycenters)

fig.savefig('temp.png', dpi=600)
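The "won't use any additional memory" claim is easy to verify: reshaping a contiguous slice returns a view onto the same buffer, not a copy. A quick check (array sizes here are illustrative, not the answer's):

```python
import numpy as np

y = np.random.random(1_000_000)
chunksize = 10000
numchunks = y.size // chunksize

# Reshaping a contiguous slice produces a *view* over the same buffer,
# so no element data is copied.
ychunks = y[:chunksize * numchunks].reshape((-1, chunksize))

print(np.shares_memory(y, ychunks))  # True -- the view aliases y's buffer
print(ychunks.shape)                 # (100, 10000)

# The per-chunk reductions are then single vectorized calls:
max_env = ychunks.max(axis=1)
print(max_env.shape)                 # (100,)
```

Only the small per-chunk reduction arrays (one value per chunk) allocate new memory.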
With 600 dpi you would have to make the plot over 300 meters wide to plot that data without decimating it. :-)
I would suggest chunking the data into pieces a couple of hundred, or maybe even a thousand, samples long and extracting the maximum value from each.
Something like this:
import itertools

def chunkmax(data, chunk_size):
    source = iter(data)
    while True:
        # Grab the next chunk_size items (fewer at the very end).
        chunk = list(itertools.islice(source, chunk_size))
        if not chunk:
            return
        yield max(chunk)
With a chunk_size of 1000 this would give you 7500 points to plot, where you can then easily see where in the data the shock comes. (Unless the data is so noisy that you would have to average it to see whether there is a shock or not. But that's also easily fixable.)
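When the data already lives in a NumPy array (as in the question), the same per-chunk maximum can be computed without a Python-level loop at all; a sketch, using synthetic data of the question's size:

```python
import numpy as np

data = np.random.random(7_500_000)  # stand-in for one column of the real data
chunk_size = 1000
n = data.size // chunk_size

# Reshape into (n_chunks, chunk_size) and reduce each row in one call.
maxima = data[:n * chunk_size].reshape(-1, chunk_size).max(axis=1)
print(maxima.size)  # 7500 points -- easy to plot at any dpi
```

This is equivalent to chunkmax for array input, just much faster; the generator version remains useful when the data is streamed and doesn't fit in memory.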