
Flatten a nested list of variable sized sublists into a SciPy array

How can I use numpy/scipy to flatten a nested list with sublists of different sizes? Speed is very important and the lists are large.

 lst = [[1, 2, 3, 4],[2, 3],[1, 2, 3, 4, 5],[4, 1, 2]]

Is anything faster than this?

 vec = sp.array(list(chain(*lst)))
user1728853 asked Mar 12 '13 15:03


2 Answers

The fastest way to create a NumPy array from an iterator is to use numpy.fromiter, which fills the array directly from the iterator without building an intermediate list:

>>> import itertools
>>> import numpy
>>> %timeit numpy.fromiter(itertools.chain.from_iterable(lst), numpy.int64)
100000 loops, best of 3: 3.76 us per loop
>>> %timeit numpy.array(list(itertools.chain.from_iterable(lst)))
100000 loops, best of 3: 14.5 us per loop
>>> %timeit numpy.hstack(lst)
10000 loops, best of 3: 57.7 us per loop

As you can see, on this input fromiter is roughly four times faster than converting to a list first, and roughly fifteen times faster than hstack.
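
If the total number of elements is known in advance, fromiter also takes a count argument, which lets it allocate the output array once instead of growing it on demand. The following is a minimal sketch, assuming integer data; the helper name flatten_fromiter is hypothetical, and whether count helps noticeably on your data is something to measure rather than assume.

```python
import itertools

import numpy as np


def flatten_fromiter(nested, dtype=np.int64):
    """Flatten a list of variable-length sublists into a 1-D array.

    Hypothetical helper for illustration; assumes every element fits dtype.
    """
    # Summing the sublist lengths up front lets fromiter preallocate.
    total = sum(len(sub) for sub in nested)
    # chain.from_iterable walks the sublists lazily, without unpacking them.
    flat = itertools.chain.from_iterable(nested)
    return np.fromiter(flat, dtype=dtype, count=total)


lst = [[1, 2, 3, 4], [2, 3], [1, 2, 3, 4, 5], [4, 1, 2]]
vec = flatten_fromiter(lst)
```

Note that fromiter always needs an explicit dtype, so this approach suits lists whose element type is known and uniform.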

senderle answered Oct 19 '22 17:10


You can try numpy.hstack:

>>> import numpy as np
>>> lst = [[1, 2, 3, 4],[2, 3],[1, 2, 3, 4, 5],[4, 1, 2]]
>>> np.hstack(lst)
array([1, 2, 3, 4, 2, 3, 1, 2, 3, 4, 5, 4, 1, 2])
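
For one-dimensional inputs like these, hstack ends up joining the sublists along their only axis, so numpy.concatenate can be called on the nested list directly. The following is a small sketch, assuming every sublist is non-empty and holds integers; the variable names are illustrative only.

```python
import numpy as np

lst = [[1, 2, 3, 4], [2, 3], [1, 2, 3, 4, 5], [4, 1, 2]]

# hstack converts each sublist to an array, then joins them end to end.
stacked = np.hstack(lst)

# concatenate accepts the same sequence of 1-D sequences directly.
joined = np.concatenate(lst)

# Both calls should yield the same flat array for this input.
same = np.array_equal(stacked, joined)
```

Unlike fromiter, neither call needs an explicit dtype, which can be convenient when the element type is not known ahead of time.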
Abhijit answered Oct 19 '22 18:10