How to bin a series of float values into a histogram in Python?

I have a set of float values (always less than 1) that I want to bin into a histogram, i.e. each bar in the histogram holds the values falling in a given range, such as [0, 0.150).

The data I have looks like this:

0.000
0.005
0.124
0.000
0.004
0.000
0.111
0.112

With my code below I expect to get a result that looks like:

[0, 0.005) 5
[0.005, 0.011) 0
...etc.. 

I tried to do the binning with the code below, but it doesn't seem to work. What's the right way to do it?

#! /usr/bin/env python


import fileinput, math

log2 = math.log(2)

def getBin(x):
    return int(math.log(x+1)/log2)

diffCounts = [0] * 5

for line in fileinput.input():
    words = line.split()
    diff = float(words[0]) * 1000;

    diffCounts[ str(getBin(diff)) ] += 1

maxdiff = [i for i, c in enumerate(diffCounts) if c > 0][-1]
print maxdiff
maxBin = max(maxdiff)


for i in range(maxBin+1):
     lo = 2**i - 1
     hi = 2**(i+1) - 1
     binStr = '[' + str(lo) + ',' + str(hi) + ')'
     print binStr + '\t' + '\t'.join(map(str, (diffCounts[i])))


neversaint asked Nov 12 '09 10:11

1 Answer

When possible, don't reinvent the wheel. NumPy has everything you need:

#!/usr/bin/env python
import numpy as np

a = np.fromfile(open('file', 'r'), sep='\n')
# [ 0.     0.005  0.124  0.     0.004  0.     0.111  0.112]

# You can set arbitrary bin edges:
bins = [0, 0.150]
hist, bin_edges = np.histogram(a, bins=bins)
# hist: [8]
# bin_edges: [ 0.    0.15]

# Or, if bins is an integer, it sets the number of equal-width bins:
bins = 4
hist, bin_edges = np.histogram(a, bins=bins)
# hist: [5 0 0 3]
# bin_edges: [ 0.     0.031  0.062  0.093  0.124]
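
To get output in the "[lo, hi) count" layout the question asks for, the counts and edges returned by np.histogram can be zipped together. The following is only a sketch: bin_counts is a hypothetical helper name, and the three bins of width 0.05 over [0, 0.150] are an assumption chosen for illustration, not something the question specifies. Note that np.histogram treats every bin as half-open except the last one, which also includes its right edge, so the last label is printed with a closing "]".

```python
import numpy as np

def bin_counts(values, edges):
    """Return (label, count) pairs, one per bin defined by `edges`."""
    hist, bin_edges = np.histogram(values, bins=edges)
    last = len(hist) - 1
    rows = []
    for i, count in enumerate(hist):
        # Bins are half-open [lo, hi), except the last one, which is closed [lo, hi].
        closer = "]" if i == last else ")"
        label = "[%g, %g%s" % (bin_edges[i], bin_edges[i + 1], closer)
        rows.append((label, int(count)))
    return rows

# Sample data copied from the question.
data = [0.000, 0.005, 0.124, 0.000, 0.004, 0.000, 0.111, 0.112]
# Assumed bin edges for illustration: 0, 0.05, 0.1, 0.15.
edges = np.linspace(0.0, 0.150, 4)

for label, count in bin_counts(data, edges):
    print("%s\t%d" % (label, count))
```

Passing a different edges sequence (for example one built with np.arange or written out by hand) changes the bin widths without touching the rest of the code.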
unutbu answered Sep 17 '22 22:09