I'd like to plot several boxplots in one figure, on one axis. However, the data for the boxplots is too large to be read into memory at once, so I read it in chunks using pandas read_csv(). In each iteration I would like to produce some boxplots and add the new boxplots from iteration i to the same figure as those from iteration i-1, without holding on to the data from iteration i-1.
I want to stress that I do not need to update the data of an already existing boxplot. Rather, I get a new data column with each iteration and want to display a boxplot of that column next to the existing ones.
E.g.: Say I have
df = pd.DataFrame(np.random.rand(100,2))
Assume that I could read the columns only one after the other. How do I add the boxplot of the second column to the already existing boxplot of the first column to have the same result as ax.boxplot(df.values)?
The boxplot method has a positions argument. Using it, you can make sure, in a loop, that each boxplot (or group of boxplots) is drawn at its own position.
Here's some code:
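In [14]: import numpy as np
In [15]: import pandas as pd
In [16]: import matplotlib.pyplot as plt
In [17]: x = pd.DataFrame(np.random.randn(10, 10))
In [18]: fig = plt.figure()
In [19]: ax = plt.subplot(111)
In [20]: for i in range(10):
    ...:     ax.boxplot(x.ix[:,i].values, positions = [i])  # .ix needs old pandas; see the update below
    ...:
In [21]: ax.set_xlim(-0.5, 9.5)
In [22]: plt.show()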
Update: ix was deprecated in pandas 0.20 and removed in pandas 1.0. On current versions, this line:
ax.boxplot(x.ix[:,i].values, positions = [i])
should be replaced by:
ax.boxplot(x.iloc[:,i].values, positions = [i])
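For the case in the question, where the columns can only be read one after the other, the same positions idea might look roughly like the sketch below. It is only an illustration under some assumptions: the small in-memory CSV and the column names a, b, c are made up to stand in for the large file, and each column is loaded separately through the usecols parameter of read_csv, so only one column is held at a time.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the large file: a small in-memory CSV with three columns.
rng = np.random.default_rng(0)
csv_text = pd.DataFrame(rng.random((100, 3)), columns=["a", "b", "c"]).to_csv(index=False)

# Read only the header row to learn the column names.
columns = list(pd.read_csv(io.StringIO(csv_text), nrows=0).columns)

fig, ax = plt.subplots()
for i, name in enumerate(columns):
    # Load a single column; the previous column is no longer referenced.
    col = pd.read_csv(io.StringIO(csv_text), usecols=[name])[name]
    # Draw this column's boxplot at its own x position on the shared axis.
    ax.boxplot(col.to_numpy(), positions=[i])

# Make all boxes visible and label them by column name.
ax.set_xlim(-0.5, len(columns) - 0.5)
ax.set_xticks(range(len(columns)))
ax.set_xticklabels(columns)
```

With a real file you would pass its path to read_csv instead of the StringIO object, and finish with plt.show() or fig.savefig(...) as usual.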