
How to plot result of np.histogram with matplotlib analog to plt.hist [duplicate]

I like to plot my histograms like this:

import matplotlib.pyplot as plt
import numpy as np

data = [-0.5, 0.5, 0.5, 0.5,
    1.5, 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2]

plt.hist(data, bins=5, range=[-1, 4], histtype='step')

Now, when my input data is large (larger than my memory), I need to fill the histogram chunk by chunk, e.g. like this:

H, bins = np.histogram([], bins=5, range=[-1, 4])
for data in a_lot_of_input_files:
    H += np.histogram(data, bins=5, range=[-1, 4])[0]
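A minimal sketch of the chunked accumulation above, assuming every chunk fits in memory on its own. `iter_chunks` is a hypothetical stand-in for reading `a_lot_of_input_files`, and `edges` is a name introduced here for illustration. Passing the same explicit edge array to every call is one way to guarantee identical binning, which is what makes the per-chunk counts additive.

```python
import numpy as np

# Fixed edges shared by every chunk: 5 equal-width bins on [-1, 4].
edges = np.linspace(-1, 4, 6)


def iter_chunks():
    """Hypothetical stand-in for reading a_lot_of_input_files one by one."""
    yield [-0.5, 0.5, 0.5, 0.5, 1.5, 2.1]
    yield [2.2, 2.3, 2.4, 2.5, 3.1, 3.2]


# Start from zero counts and add the counts of each chunk.
H = np.zeros(len(edges) - 1, dtype=np.int64)
for chunk in iter_chunks():
    counts, _ = np.histogram(chunk, bins=edges)
    H += counts

# Sanity check: histogramming all values in one pass gives the same counts.
all_data = [x for chunk in iter_chunks() for x in chunk]
H_single, _ = np.histogram(all_data, bins=edges)
assert np.array_equal(H, H_single)
```

Only `H` and `edges` need to be kept between chunks, so memory use does not grow with the number of input files.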

But the question is always: "How do I plot this H again, so that it looks just like the previous matplotlib version?"

The solution I came up with looks like this:

plt.plot(bins, np.insert(H, 0, H[0]), '-', drawstyle='steps')

[Figure: two different versions of plotting a histogram.]

However, the result does not look identical, and it does not feel very nice to create a copy of H just for plotting it.

Is there some elegant solution I am missing? (I did not yet try to use plt.bar, because the bar-graphs don't work nicely, when one wants to compare histograms)

Asked Oct 27 '15 at 15:10 by Dominik Neise


1 Answer

I'm not sure what you mean by "bar-graphs don't work nicely, when one wants to compare histograms".

One way to do this is with plt.bar:

import matplotlib.pyplot as plt
import numpy as np

data = [-0.5, 0.5, 0.5, 0.5, 
    1.5, 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2]

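plt.hist(data, bins=5, range=[-1, 4], histtype='step', edgecolor='r', linewidth=3)
H, bins = np.histogram(data[:6], bins=5, range=[-1, 4])
H += np.histogram(data[6:], bins=5, range=[-1, 4])[0]

# bins[:-1] holds the left bin edges, so anchor each bar at its left edge
# (Matplotlib >= 2.0 centres bars on x by default).
plt.bar(bins[:-1], H, width=np.diff(bins), align='edge')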

plt.show()

An alternative is plt.step:

import matplotlib.pyplot as plt
import numpy as np

data = [-0.5, 0.5, 0.5, 0.5, 
    1.5, 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2]

plt.hist(data, bins=5, range=[-1, 4], histtype='step', edgecolor='r')
H, bins = np.histogram(data[:6], bins=5, range=[-1, 4])
H += np.histogram(data[6:], bins=5, range=[-1, 4])[0]

bincentres = (bins[:-1] + bins[1:]) / 2.
plt.step(bincentres, H, where='mid', color='b', linestyle='--')

plt.ylim(0,6)

plt.show()

The step line doesn't quite extend all the way to the outer bin edges, so you might need to add a zero-count bin at either end if that's a big problem for you.
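One possible way to do that padding is sketched below. It is a variant of the suggestion, not exactly the same thing: instead of bin centres with `where='mid'`, it uses the bin edges with `where='post'` and adds a zero at each end, so the outline rises from zero at the first edge and drops back to zero at the last one. The helper name `step_outline` is made up for this example.

```python
import numpy as np
import matplotlib.pyplot as plt


def step_outline(H, bins):
    """Return x, y arrays that trace the histogram outline down to zero."""
    H = np.asarray(H)
    bins = np.asarray(bins)
    # Repeat the first edge so the line starts at zero and rises vertically.
    x = np.concatenate(([bins[0]], bins))
    # A zero at both ends closes the outline at the outer bin edges.
    y = np.concatenate(([0], H, [0]))
    return x, y


data = [-0.5, 0.5, 0.5, 0.5,
    1.5, 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2]

# Fill the histogram in two chunks, as in the examples above.
H, bins = np.histogram(data[:6], bins=5, range=[-1, 4])
H += np.histogram(data[6:], bins=5, range=[-1, 4])[0]

x, y = step_outline(H, bins)

if __name__ == "__main__":
    plt.hist(data, bins=5, range=[-1, 4], histtype='step', edgecolor='r')
    # where='post' holds y[i] constant on the interval [x[i], x[i+1]).
    plt.step(x, y, where='post', color='b', linestyle='--')
    plt.ylim(0, 6)
    plt.show()
```

Matplotlib 3.4 and later also provide `plt.stairs(H, bins)`, which draws a step outline directly from counts and bin edges, so the manual padding may not be needed there.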

Answered Oct 18 '22 at 17:10 by tmdavison