I have a list containing more than 100,000 values.
I need to divide the list into multiple smaller lists based on a specific bin width, say 0.1. Can anyone help me write a Python program to do this?
My list looks like this:
-0.234
-0.04325
-0.43134
-0.315
-0.6322
-0.245
-0.5325
-0.6341
-0.5214
-0.531
-0.124
-0.0252
I would like to get output like this:
list1 = [-0.04325, -0.0252]
list2 = [-0.124]
list3 = [-0.234, -0.245]
list4 = [-0.315]
list5 = [-0.43134]
list6 = [-0.5325, -0.5214, -0.531]
list7 = [-0.6322, -0.6341]
Here is a simple way using NumPy's digitize:
>>> import numpy as np
>>> mylist = np.array([-0.234, -0.04325, -0.43134, -0.315, -0.6322, -0.245,
...                    -0.5325, -0.6341, -0.5214, -0.531, -0.124, -0.0252])
>>> bins = np.arange(0, -1, -0.1)  # bin edges 0, -0.1, ..., -0.9
>>> for i in range(1, 10):  # on Python 2, xrange works here as well
...     mylist[np.digitize(mylist, bins) == i]
...
array([-0.04325, -0.0252 ])
array([-0.124])
array([-0.234, -0.245])
array([-0.315])
array([-0.43134])
array([-0.5325, -0.5214, -0.531 ])
array([-0.6322, -0.6341])
array([], dtype=float64)
array([], dtype=float64)
digitize returns an array holding, for each element, the index of the bin that element falls into.
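As a minimal sketch of the same idea, the bin indices can be computed once and used to collect the non-empty bins into a dictionary. The names values, indices and binned are illustrative assumptions, not part of the original answer.

```python
import numpy as np

values = np.array([-0.234, -0.04325, -0.43134, -0.315, -0.6322, -0.245,
                   -0.5325, -0.6341, -0.5214, -0.531, -0.124, -0.0252])
bins = np.arange(0, -1, -0.1)  # decreasing bin edges 0, -0.1, ..., -0.9

# Compute the bin index of every value once, rather than once per bin.
indices = np.digitize(values, bins)

# Map each non-empty bin index to the values that fall into it,
# keeping the values in their original order.
binned = {int(i): values[indices == i].tolist() for i in np.unique(indices)}

for index, members in binned.items():
    print(index, members)
```

Using np.unique skips the empty bins, so there is no need to know in advance how many bins are populated.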
Binning can be done with itertools.groupby:
import itertools as it
iterable = ['-0.234', '-0.04325', '-0.43134', '-0.315', '-0.6322', '-0.245',
'-0.5325', '-0.6341', '-0.5214', '-0.531', '-0.124', '-0.0252']
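# Sort the strings, then group neighbours that share their first four characters, e.g. '-0.2'.
a, b, c, d, e, f, g = [list(group) for _, group in it.groupby(sorted(iterable), key=lambda x: x[:4])]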
c
# ['-0.234', '-0.245']
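Note: this simple key function assumes the values in the iterable are strings between -0.0 and -10.0. Consider lambda x: "{:.1f}".format(float(x)) for general cases. Be aware that this key rounds to the nearest tenth rather than truncating, and that the list should be sorted with the same key before grouping.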
See also this post for details on how groupby works.
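For values outside that range, or for a different bin width, one option is to group on a numeric key instead of a string prefix. The following is a rough sketch under that assumption; bin_index, width and groups are hypothetical names introduced for illustration, and values lying exactly on a bin edge may land on either side because of floating-point rounding.

```python
import itertools as it
import math

raw = ['-0.234', '-0.04325', '-0.43134', '-0.315', '-0.6322', '-0.245',
       '-0.5325', '-0.6341', '-0.5214', '-0.531', '-0.124', '-0.0252']
width = 0.1  # assumed bin width


def bin_index(text):
    """Return the integer index k of the bin [k*width, (k+1)*width) holding float(text)."""
    return math.floor(float(text) / width)


# groupby only merges adjacent items, so sort by the same key first.
# sorted() is stable, so values keep their original order within a bin.
ordered = sorted(raw, key=bin_index)
groups = {k: [float(v) for v in grp] for k, grp in it.groupby(ordered, key=bin_index)}

for k, members in groups.items():
    print(k, members)
```

Here bin -1 covers the values from -0.1 up to 0, bin -2 covers -0.2 up to -0.1, and so on.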