I have data as a list of floats and I want to plot it as a histogram. The hist() function does the job perfectly for plotting the absolute histogram. However, I cannot figure out how to represent it in a relative frequency format: I would like the y-axis to show a fraction or, ideally, a percentage.
Here is the code:
    fig = plt.figure()
    ax = fig.add_subplot(111)
    n, bins, patches = ax.hist(mydata, bins=100, normed=1, cumulative=0)
    ax.set_xlabel('Bins', size=20)
    ax.set_ylabel('Frequency', size=20)
    ax.legend
    plt.show()
I thought the normed=1 argument would do it, but it gives fractions that are too high and sometimes greater than 1. They also seem to depend on the bin size, as if they were not normalized by the bin size or something. Nevertheless, when I set cumulative=1, it nicely sums up to 1. So, where is the catch? By the way, when I feed the same data into Origin and plot it, it gives me perfectly correct fractions. Thank you!
A relative frequency histogram is a minor modification of a typical frequency histogram. Rather than using a vertical axis for the count of data values that fall into a given bin, we use this axis to represent the overall proportion of data values that fall into this bin.
A relative frequency measures how often a certain value occurs in a dataset, relative to the total number of values in that dataset. To calculate the relative frequencies, we divide each absolute frequency by the total number of values in the array.
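As a minimal sketch of that division, the snippet below bins a small made-up array with numpy.histogram and converts the counts to fractions and percentages. The sample values, the choice of 4 bins, and the 0 to 2 range are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical sample data, used only for illustration.
mydata = np.array([0.1, 0.2, 0.2, 0.7, 0.9, 1.4, 1.6, 1.6, 1.7, 1.9])

# Absolute frequencies: how many values fall into each of 4 equal-width bins.
counts, bin_edges = np.histogram(mydata, bins=4, range=(0.0, 2.0))

# Relative frequencies: each count divided by the total number of values.
rel_freq = counts / mydata.size

# The same numbers expressed as percentages.
percent = 100 * rel_freq

print(counts)
print(rel_freq)
print(percent)
```

Because every value lands in exactly one bin, the relative frequencies add up to 1 regardless of how many bins are used.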
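This happens because the normed option of hist returns the density of points, i.e. dN/dx: each bar height is the fraction of values in the bin divided by the bin width. The bar areas, not the bar heights, sum to 1, so the heights can exceed 1 when the bins are narrower than 1 and they change with the bin size. (Recent matplotlib versions call this option density; normed was deprecated and later removed.)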
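The relation between density and fraction can be checked numerically. The sketch below uses numpy.histogram with density=True, which applies the same normalization, on randomly generated data. The data, the seed, and the narrow spread of 0.1 are assumptions chosen so that the bar heights clearly exceed 1.

```python
import numpy as np

# Hypothetical data: 1000 normally distributed values with a narrow spread.
rng = np.random.default_rng(0)
mydata = rng.normal(loc=0.0, scale=0.1, size=1000)

# Density-normalized histogram: heights are (fraction in bin) / (bin width).
density, edges = np.histogram(mydata, bins=100, density=True)
widths = np.diff(edges)

# Multiplying each height by its bin width recovers the fraction per bin.
fractions = density * widths

# Cross-check against counts divided by the number of values.
counts, _ = np.histogram(mydata, bins=edges)

print(density.max())    # larger than 1 because the bins are narrow
print(fractions.sum())  # the areas sum to 1 (up to rounding)
```

This is also why the cumulative version ends at 1: it accumulates the bar areas, not the bar heights.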
What you need is something like this:
    # assuming that mydata is a numpy array
    # giving every point the weight 1/N makes the bar heights sum to 1, i.e. fractions
    ax.hist(mydata, weights=np.zeros_like(mydata) + 1. / mydata.size)
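For the percentage axis asked about in the question, one possible approach is to keep the fractional weights and only change the tick labels with matplotlib's PercentFormatter. The helper name plot_relative_hist and its default arguments are hypothetical and exist only for this sketch. The axis label text is also changed for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter


def plot_relative_hist(data, bins=100):
    """Plot a histogram whose bar heights are fractions of all values.

    Hypothetical helper written for this example only.
    """
    data = np.asarray(data, dtype=float)

    # Each value contributes 1/N, so the bar heights sum to 1.
    weights = np.ones_like(data) / data.size

    fig, ax = plt.subplots()
    fractions, edges, _ = ax.hist(data, bins=bins, weights=weights)

    # Show the fractions as percentages: a height of 1.0 is labeled 100%.
    ax.yaxis.set_major_formatter(PercentFormatter(xmax=1))

    ax.set_xlabel('Bins', size=20)
    ax.set_ylabel('Relative frequency', size=20)
    return fig, ax, fractions
```

Calling plot_relative_hist(mydata) followed by plt.show() should display the plot. The returned fractions array holds the bar heights, so their sum can be checked against 1.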