I have two different dataframes with 19 variables each, and I'm plotting a grid of histograms, one per variable, like this:
fig, ax = plt.subplots(figsize=(19,10), dpi=50)
dataframe1.hist(ax=ax, layout=(3,7), alpha=0.5)
fig, ax = plt.subplots(figsize=(19,10), dpi=50)
dataframe2.hist(ax=ax, layout=(3,7), alpha=0.5)
This produces two images, each with 19 histograms. What I want instead is a single image in which the histograms of both dataframes share the same subplots.
I tried this:
fig, ax = plt.subplots(figsize=(19,10), dpi=50)
dataframe1.hist(ax=ax, layout=(3,7), alpha=0.5, label='x')
dataframe2.hist(ax=ax, layout=(3,7), alpha=0.5, label='y', color='red')
But it only draws the last one. This is a similar example: Plot two histograms at the same time with matplotlib, but how could I apply it to my 19 subplots?
Any ideas will be welcomed, thanks in advance!
P.S: I'm currently using Jupyter Notebooks with the %matplotlib notebook option
Your problem is that you create only one Axes object in your plt.subplots call, when you actually need 21 (3x7). As the number of subplots provided does not match the number of subplots requested, pandas creates new subplots. Because this happens twice, you only see the second set of histograms.
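To make the mismatch concrete, here is a minimal sketch comparing what plt.subplots returns with and without a grid. The variable names and the Agg backend line are assumptions added for illustration (the backend line only keeps the snippet runnable without a display and is not needed in Jupyter):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, not needed in a notebook
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

# Without nrows/ncols, subplots returns a single Axes object.
fig_single, ax_single = plt.subplots(figsize=(19, 10))

# With a 3x7 grid, it returns a 2d NumPy array holding 21 Axes objects.
fig_grid, ax_grid = plt.subplots(nrows=3, ncols=7, figsize=(19, 10))

print(isinstance(ax_single, Axes))  # True
print(ax_grid.shape)                # (3, 7)
```

A single Axes cannot hold 19 histograms, which is why passing it as ax= does not give the shared layout you are after.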
You can leave out the call to subplots altogether and let pandas do all the work. The call to hist returns all the subplots needed, and these can then be passed to the second call to hist.
EDIT:
I realised that, if the number of desired plots is not equal to the number of grid cells (in this case 3x7=21), you must pass exactly as many subplots as you actually want to plot on (in this case 19). However, the call to df.hist returns a subplot for each grid cell (i.e. 21) and apparently hides the unused ones. Hence you have to pass only a subset of the returned subplots to the second call to hist. This is most easily done by converting the 2d array of subplots into a 1d array and then slicing it, for instance with axes.ravel()[:19]. I edited the code accordingly:
import numpy as np
from matplotlib import pyplot as plt
import pandas as pd
length=19
loc = np.random.randint(0,50,size=length)
scale = np.random.rand(length)*10
dist = np.random.normal(loc=loc, scale=scale, size=(100,length))
df1 = pd.DataFrame(data=list(dist))
axes = df1.hist(layout=(3,7), alpha=0.5, label='x')
loc = np.random.randint(0,50,size=length)
scale = np.random.rand(length)*10
dist = np.random.normal(loc=loc, scale=scale, size=(100,length))
df2 = pd.DataFrame(data=list(dist))
df2.hist(ax=axes.ravel()[:length], layout=(3,7), alpha=0.5, label='y', color='r')
plt.show()
This produces a single figure with both histograms overlaid in each of the 19 subplots.
When you call subplots, you can specify the number of rows and columns that you want. In your case, you want 3 rows and 7 columns. However, .hist will complain if it is given 21 axes but your dataframe only has 19 columns to plot. So instead, we'll flatten the axes into a list, which allows us to remove the last two from both the figure and the set of axes simultaneously through .pop():
fig, axes = plt.subplots(figsize=(19,10), dpi=50, nrows=3, ncols=7)
flat_axes = list(axes.reshape(-1))
fig.delaxes(flat_axes.pop(-1))
fig.delaxes(flat_axes.pop(-1))
dataframe1.hist(ax=flat_axes, alpha=0.5, label='x')
dataframe2.hist(ax=flat_axes, alpha=0.5, label='y', color='r')
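For anyone who wants to try this without the original data, below is a self-contained sketch of the same idea. The function name overlay_histograms, the random test data, the column names and the Agg backend line are all assumptions made up for illustration. It also passes a NumPy array slice of the axes rather than a Python list, as the axes.ravel()[:19] suggestion in the other answer does; whether a plain list is accepted may depend on your pandas version.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def overlay_histograms(df_a, df_b, nrows=3, ncols=7):
    """Draw the histograms of two same-shaped dataframes on shared subplots."""
    n_vars = df_a.shape[1]
    fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(19, 10))
    flat = axes.ravel()
    # Remove the grid cells that have no column to show.
    for extra in flat[n_vars:]:
        fig.delaxes(extra)
    used = flat[:n_vars]
    # Both calls receive exactly one Axes per column, so pandas reuses them.
    df_a.hist(ax=used, alpha=0.5)
    df_b.hist(ax=used, alpha=0.5, color="r")
    return fig, used


# Hypothetical stand-ins for dataframe1 and dataframe2.
rng = np.random.default_rng(0)
cols = [f"v{i}" for i in range(19)]
df1 = pd.DataFrame(rng.normal(0, 1, size=(100, 19)), columns=cols)
df2 = pd.DataFrame(rng.normal(1, 2, size=(100, 19)), columns=cols)

fig, used = overlay_histograms(df1, df2)
fig.savefig("overlaid_histograms.png")
```

Slicing the flattened array keeps the figure and the set of axes in sync in the same way the .pop() calls do, just without hard-coding how many cells are left over.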