Python histograms: Manually normalising counts and re-plotting as histogram

I tried searching for something similar, and the closest thing I could find was this answer, which helped me to extract and manipulate the data, but now I can't figure out how to re-plot the histogram. I have an array of voltages, and I have first plotted a histogram of occurrences of those voltages. I want to instead make a histogram of events per hour (so the y-axis of a normal histogram divided by the number of hours I took data) and then re-plot the histogram with the manipulated y data.

I have an array which contains the number of events per hour (composed of the original y-axis from pyplot.hist divided by the number of hours data was taken), and the bins from the histogram. I have composed that array using the following code (taken from the answer linked above):

import numpy
import matplotlib.pyplot as pyplot
mydata = numpy.random.normal(-15, 1, 500)      # this seems to have to be 'uneven' on either side of 0, otherwise the code looks fine. FYI, my actual data is all positive
pyplot.figure(1)
hist1 = pyplot.hist(mydata, bins=50, alpha=0.5, label='set 1', color='red')
hist1_flux = [hist1[0]/5.0, 0.5*(hist1[1][1:]+hist1[1][:-1])]
pyplot.figure(2)
pyplot.bar(hist1_flux[1], hist1_flux[0])

This code doesn't exactly match what's going on in my code; my data is composed of 1000 arrays of 1000 data points each (voltages). I have made histograms of that, which gives me the number of occurrences in a given voltage range (one bin width). All I want to do is re-plot a histogram of the number of events per hour (so the y-axis of the histogram / 5 hours) with the same original bin width, but when I divide hist1[0]/5 and replot in the above way, the 'bin width' is all wrong.

I feel like there must be an easier way to do this, rather than manually replotting my own histograms.

Thanks in advance, and I'm really sorry if I've missed something obvious.

The problem, illustrated in the output of my sample code AND my original data, is as follows:

Upper plots: code snippet output.
Lower plots: My actual data.

la femme cosmique asked Jan 27 '16 10:01


1 Answer

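It's because the bar function takes a width argument, which defaults to 0.8 in data units (plt.bar(left, height, width=0.8, bottom=None, hold=None, **kwargs)). When the bins are narrower than that, as in your sample, the bars overlap, so you need to set width to the distance between two neighbouring bin centres: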

pyplot.bar(hist1_flux[1], hist1_flux[0],
           width=hist1_flux[1][1] - hist1_flux[1][0])
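
As for an easier way than rebuilding the histogram by hand, here is a minimal sketch of two options, not a definitive implementation. It assumes 5 hours of data as in the question; the helper name rate_histogram and the variable hours are made up for illustration. Option 1 bins with numpy.histogram and draws the bars from the bin edges, so the widths are right even if the bins are unequal. Option 2 lets pyplot.hist do the scaling by giving every event a weight of 1/hours.

```python
import numpy
import matplotlib.pyplot as pyplot


def rate_histogram(data, hours, bins=50):
    """Return (events per hour in each bin, bin edges). Hypothetical helper."""
    counts, edges = numpy.histogram(data, bins=bins)
    return counts / float(hours), edges


mydata = numpy.random.normal(-15, 1, 500)  # stand-in data, as in the question
hours = 5.0                                # assumed length of the data run

rates, edges = rate_histogram(mydata, hours)

# Option 1: draw each bar from its left edge, one bin wide.
pyplot.figure(1)
pyplot.bar(edges[:-1], rates, width=numpy.diff(edges), align='edge',
           alpha=0.5, color='red', label='set 1')
pyplot.ylabel('events per hour')

# Option 2: weight every event by 1/hours and let hist do the binning.
pyplot.figure(2)
weights = numpy.full(mydata.shape, 1.0 / hours)
rates2, edges2, _ = pyplot.hist(mydata, bins=50, weights=weights,
                                alpha=0.5, color='red', label='set 1')
pyplot.ylabel('events per hour')
```

Both figures should show the same bars; call pyplot.show() afterwards to display them. The weights route is the shortest if you only need the plot, while the numpy.histogram route is handy if you want to keep working with the rates array.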
jrjc answered Nov 03 '22 20:11