
Divide a list into multiple lists based on a bin size

Tags: python

I have a list containing more than 100,000 values.

I need to divide the list into multiple smaller lists based on a specific bin width, say 0.1. Can anyone help me write a Python program to do this?

My list looks like this:

-0.234
-0.04325
-0.43134
-0.315
-0.6322
-0.245
-0.5325
-0.6341
-0.5214
-0.531
-0.124
-0.0252

I would like to have an output like this:

list1 = [-0.04325, -0.0252]
list2 = [-0.124]
list3 = [-0.234, -0.245]
list4 = [-0.315]
list5 = [-0.43134]
list6 = [-0.5325, -0.5214, -0.531]
list7 = [-0.6322, -0.6341]
user1492449 asked Jun 30 '12 02:06


2 Answers

Here is a simple way using NumPy's digitize:

>>> import numpy as np
>>> mylist = np.array([-0.234, -0.04325, -0.43134, -0.315, -0.6322, -0.245,
                       -0.5325, -0.6341, -0.5214, -0.531, -0.124, -0.0252])
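>>> bins = np.arange(0, -1, -0.1)  # decreasing bin edges: 0, -0.1, ..., -0.9
>>> idx = np.digitize(mylist, bins)  # bin index of each value, computed once
>>> for i in range(1, 10):  # range replaces Python 2's xrange
...     mylist[idx == i]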
... 
array([-0.04325, -0.0252 ])
array([-0.124])
array([-0.234, -0.245])
array([-0.315])
array([-0.43134])
array([-0.5325, -0.5214, -0.531 ])
array([-0.6322, -0.6341])
array([], dtype=float64)
array([], dtype=float64)

digitize returns an array holding, for each element, the index of the bin it falls into. With decreasing bin edges, as here, index i means bins[i-1] > x >= bins[i].
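
If the bins are needed as separate Python lists rather than printed arrays, the same idea can be written as a script. This is only a minimal sketch: the names `values`, `idx` and `groups` are introduced here for illustration and are not part of the answer above, and empty bins are simply skipped.

```python
import numpy as np

values = np.array([-0.234, -0.04325, -0.43134, -0.315, -0.6322, -0.245,
                   -0.5325, -0.6341, -0.5214, -0.531, -0.124, -0.0252])

# Decreasing bin edges: 0, -0.1, ..., -0.9
bins = np.arange(0, -1, -0.1)

# Bin index of every value, computed once
idx = np.digitize(values, bins)

# Map bin index -> list of the values in that bin (only non-empty bins appear)
groups = {int(i): values[idx == i].tolist() for i in np.unique(idx)}

for i, members in sorted(groups.items()):
    print(i, members)
```

Values outside the range covered by `bins` get index 0 or `len(bins)`, so they would show up under those keys rather than being dropped.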

fraxel answered Oct 20 '22 00:10


Binning can be done with itertools.groupby:

import itertools as it


iterable = ['-0.234', '-0.04325', '-0.43134', '-0.315', '-0.6322', '-0.245',
            '-0.5325', '-0.6341', '-0.5214', '-0.531', '-0.124', '-0.0252']

# Sort the strings, then group them by their first four characters (e.g. '-0.2')
a, b, c, d, e, f, g = [list(grp) for _, grp in it.groupby(sorted(iterable), key=lambda x: x[:4])]
print(c)
# ['-0.234', '-0.245']

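Note: this simple key function assumes the values in the iterable are strings representing numbers between -0.0 and -10.0, so that the first four characters cover the sign, the units digit and the first decimal place. Consider lambda x: "{:.1f}".format(float(x)) for general cases, and pass the same key to sorted so that equal keys end up adjacent. Be aware that this key rounds to the nearest 0.1 rather than truncating, so for example -0.26 is grouped under "-0.3".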

See also this post for details on how groupby works.
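
For numeric input and an arbitrary bin width, the groupby approach could be adapted along these lines. This is a sketch under assumptions, not the method of the answer above: `width` and the helper `bin_index` are hypothetical names, and values lying exactly on a bin edge may land on either side because of floating-point rounding.

```python
import itertools as it
import math

values = [-0.234, -0.04325, -0.43134, -0.315, -0.6322, -0.245,
          -0.5325, -0.6341, -0.5214, -0.531, -0.124, -0.0252]
width = 0.1  # bin width from the question


def bin_index(x):
    """Hypothetical helper: integer index of the bin of size `width` containing x."""
    return math.floor(x / width)


# groupby only merges adjacent items, so sort by the same key first.
# reverse=True puts the bin closest to zero first; the sort is stable,
# so values keep their original order within each bin.
ordered = sorted(values, key=bin_index, reverse=True)
groups = [list(grp) for _, grp in it.groupby(ordered, key=bin_index)]

for n, members in enumerate(groups, start=1):
    print("list{} = {}".format(n, members))
```

Using a numeric key avoids the string-slicing assumption, at the cost of requiring the input to be numbers rather than strings.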

pylang answered Oct 19 '22 23:10