I am using matplotlib.pyplot to create histograms. I'm not actually interested in the plots of these histograms, only in the frequencies and bins (I know I can write my own code to do this, but I would prefer to use this package).
I know I can do the following,
import numpy as np
import matplotlib.pyplot as plt
x1 = np.random.normal(1.5, 1.0, 1000)  # sample size of 1000 is assumed; the original call omitted it
x2 = np.random.normal(0, 1.0, 1000)
freq, bins, patches = plt.hist([x1, x2], 50, histtype='step')
to create a histogram. All I need is freq[0], freq[1], and bins[0]. The problem occurs when I try to use
freq, bins, patches = plt.hist([x1, x2], 50, histtype='step')
in a function. For example,
def func(x, y, Nbins):
    freq, bins, patches = plt.hist([x, y], Nbins, histtype='step')  # create histogram
    bincenters = 0.5*(bins[1:] + bins[:-1])  # center bins
    xf = [float(i) for i in freq[0]]  # convert counts to float
    yf = [float(i) for i in freq[1]]
    p = [(bincenters[j], 1.0 / (xf[j] + yf[j])) for j in range(Nbins) if (xf[j] + yf[j]) != 0]
    Xt = [j for i,j in p]  # separate pairs formed in p
    Yt = [i for i,j in p]
    Y = np.array(Yt)  # convert to arrays for later fitting
    X = np.array(Xt)
    return X, Y  # return arrays X and Y
When I call func(x1,x2,Nbins) and plot or print X and Y, I do not get my expected curve/values. I suspect it has something to do with plt.hist, since there is a partial histogram in my plot.
In Matplotlib, the hist() function in the pyplot module plots a histogram from an array of numbers passed to it as an argument.
You can use np.histogram2d (for 2D histogram) or np.histogram (for 1D histogram):
hst = np.histogram(A, bins)
hst2d = np.histogram2d(X,Y,bins)
The return values have the same form as those of plt.hist and plt.hist2d, except that there is no plot and therefore no patches or image object in the output.
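To make the return values concrete, here is a minimal sketch. The sample arrays a and b, the seed, and the bin count of 50 are made up for illustration and are not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
a = rng.normal(1.5, 1.0, size=1000)  # illustrative data, not from the question
b = rng.normal(0.0, 1.0, size=1000)

# 1D: counts per bin and the bin edges (one more edge than there are bins)
counts, edges = np.histogram(a, bins=50)

# 2D: matrix of counts plus the bin edges along each axis
counts2d, xedges, yedges = np.histogram2d(a, b, bins=50)

print(counts.shape, edges.shape)  # (50,) (51,)
print(counts2d.shape)             # (50, 50)
```

Nothing is drawn here, so calling this inside a function leaves any existing figure untouched.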
I don't know if I'm understanding your question very well, but here, you have an example of a very simple home-made histogram (in 1D or 2D), each one inside a function, and properly called:
import numpy as np
import matplotlib.pyplot as plt

def func2d(x, y, nbins):
    histo, xedges, yedges = np.histogram2d(x,y,nbins)
    plt.plot(x,y,'wo',alpha=0.3)
    plt.imshow(histo.T,
               extent=[xedges.min(),xedges.max(),yedges.min(),yedges.max()],
               origin='lower',
               interpolation='nearest',
               cmap=plt.cm.hot)
    plt.show()

def func1d(x, nbins):
    histo, bin_edges = np.histogram(x,nbins)
    bin_center = 0.5*(bin_edges[1:] + bin_edges[:-1])
    plt.step(bin_center,histo,where='mid')
    plt.show()

x = np.random.normal(1.5,1.0, (1000,1000))
func1d(x[0],40)
func2d(x[0],x[1],40)
Of course, you may check if the centering of the data is right, but I think that the example shows some useful things about this topic.
My recommendation: try to avoid explicit loops in your code, because they kill performance. Notice that my example has no loops. The best practice for numerical problems in Python is to avoid loops; NumPy has a lot of C-implemented functions that do all the hard looping work.
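As a rough sketch of what that could look like for the function in the question, the version below computes the same quantity (bin centers and 1 / (xf + yf) for non-empty bins) without list comprehensions and without plotting. The name inverse_total_counts and the sample data are hypothetical. It assumes both samples should share bin edges spanning their combined range, which is how plt.hist bins a list of datasets, and it returns the bin centers first.

```python
import numpy as np

def inverse_total_counts(x, y, nbins):
    """Return bin centers and 1 / (count_x + count_y) for non-empty bins."""
    # Shared edges over the combined range of both samples (assumption)
    edges = np.histogram_bin_edges(np.concatenate([x, y]), bins=nbins)
    xf, _ = np.histogram(x, bins=edges)
    yf, _ = np.histogram(y, bins=edges)

    centers = 0.5 * (edges[1:] + edges[:-1])
    total = (xf + yf).astype(float)

    # Boolean mask replaces the "if total != 0" filter in the loop version
    nonzero = total != 0
    return centers[nonzero], 1.0 / total[nonzero]

rng = np.random.default_rng(0)  # illustrative data only
x1 = rng.normal(1.5, 1.0, size=1000)
x2 = rng.normal(0.0, 1.0, size=1000)
X, Y = inverse_total_counts(x1, x2, 50)
```

Because no pyplot call is involved, no partial histogram ends up in a later plot of X and Y.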