I'd like to create a stacked histogram. If I have a single 2-D array, made of three equal length data sets, this is simple. Code and image below:
import numpy as np
from matplotlib import pyplot as plt
# create 3 data sets with 1,000 samples
mu, sigma = 200, 25
x = mu + sigma*np.random.randn(1000,3)
#Stack the data
plt.figure()
n, bins, patches = plt.hist(x, 30, stacked=True, density = True)
plt.show()
However, if I try similar code with three data sets of a different length the results are that one histogram covers up another. Is there any way I can do the stacked histogram with mixed length data sets?
##Continued from above
###Now as three separate arrays
x1 = mu + sigma*np.random.randn(990,1)
x2 = mu + sigma*np.random.randn(980,1)
x3 = mu + sigma*np.random.randn(1000,1)
#Stack the data
plt.figure()
plt.hist(x1, bins, stacked=True, density = True)
plt.hist(x2, bins, stacked=True, density = True)
plt.hist(x3, bins, stacked=True, density = True)
plt.show()
histtype : This parameter is an optional parameter and it is used to draw type of histogram. {'bar', 'barstacked', 'step', 'stepfilled'}
plt. hist() method is used multiple times to create a figure of three overlapping histograms. we adjust opacity, color, and number of bins as needed. Three different columns from the data frame are taken as data for the histograms.
To make multiple overlapping histograms, we need to use Matplotlib pyplot's hist function multiple times. For example, to make a plot with two histograms, we need to use pyplot's hist() function two times. Here we adjust the transparency with alpha parameter and specify a label for each variable.
Well, this is simple. I just need to put the three arrays in a list.
##Continued from above
###Now as three separate arrays
x1 = mu + sigma*np.random.randn(990,1)
x2 = mu + sigma*np.random.randn(980,1)
x3 = mu + sigma*np.random.randn(1000,1)
#Stack the data
plt.figure()
plt.hist([x1,x2,x3], bins, stacked=True, density=True)
plt.show()
pandas
is an option, the arrays can be loaded into a dataframe and plotted.list
of DataFrames
with pandas.DataFrame
, for each array, and then concat
the arrays together in a list-comprehension.
df = pd.DataFrame({'x1': x1, 'x2': x2, 'x3': x3})
pandas.DataFrame.plot
, which uses matplotlib
as the default plot engine.
normed
has been replaced with density
in matplotlib
density
parameter in matplotlib.pyplot.hist
for an explanation of the y-axis values.import pandas as pd
import numpy as np
# create the uneven arrays
mu, sigma = 200, 25
np.random.seed(365)
x1 = mu + sigma*np.random.randn(990, 1)
x2 = mu + sigma*np.random.randn(980, 1)
x3 = mu + sigma*np.random.randn(1000, 1)
# create the dataframe; enumerate is used to make column names
df = pd.concat([pd.DataFrame(a, columns=[f'x{i}']) for i, a in enumerate([x1, x2, x3], 1)], axis=1)
# plot the data
df.plot.hist(stacked=True, bins=30, density=True, figsize=(10, 6), grid=True)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With