I have a list that looks like this:
[[1,2,3],[1,2],[1,4,5,6,7]]
and I want to flatten it into [1,2,3,1,2,1,4,5,6,7].
Is there a lightweight function to do this without using numpy?
Flattening a list of lists entails converting a 2D list into a 1D list by un-nesting each list item stored in the list of lists, i.e., converting [[1, 2, 3], [4, 5, 6], [7, 8, 9]] into [1, 2, 3, 4, 5, 6, 7, 8, 9].
Flatten a NumPy array with reshape(-1): you can also use reshape() to convert a NumPy array to one dimension. If you pass -1, the size is calculated automatically, so reshape(-1) flattens the array. reshape() is available both as a method of numpy.ndarray and as the function numpy.reshape().
Without numpy (ndarray.flatten), one way would be using chain.from_iterable, which is an alternate constructor for itertools.chain:
>>> from itertools import chain
>>> list(chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]]))
[1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
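As a minimal sketch, the same call can be wrapped in a small reusable helper. The name flatten_once is hypothetical (it is not part of itertools), and the helper assumes every element of the outer list is itself iterable:

```python
from itertools import chain


def flatten_once(list_of_lists):
    """Flatten exactly one level of nesting into a new list."""
    # chain.from_iterable lazily yields the items of each sublist in turn;
    # list() consumes that iterator to build the flat result.
    return list(chain.from_iterable(list_of_lists))


nested = [[1, 2, 3], [1, 2], [1, 4, 5, 6, 7]]
print(flatten_once(nested))  # [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
```

Because chain.from_iterable is lazy, you can also iterate over its result directly and skip the list() call when you do not need the whole flat list in memory.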
Or, as another Pythonic approach, you can use a list comprehension:
[j for sub in [[1,2,3],[1,2],[1,4,5,6,7]] for j in sub]
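To make the order of the two for clauses easier to read, here is a short sketch that spells the comprehension out as the equivalent nested loops (the variable names are illustrative only). It also shows that this approach removes just one level of nesting:

```python
nested = [[1, 2, 3], [1, 2], [1, 4, 5, 6, 7]]

# The for clauses read left to right, outer loop first.
flat = [item for sub in nested for item in sub]

# Equivalent explicit loops.
flat_loop = []
for sub in nested:
    for item in sub:
        flat_loop.append(item)

print(flat == flat_loop)  # True

# Only one level is flattened: deeper lists are kept as they are.
deeper = [[1, [2, 3]], [4]]
print([item for sub in deeper for item in sub])  # [1, [2, 3], 4]
```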
Another functional approach, suitable only for short lists, is reduce in Python 2 or functools.reduce in Python 3. Don't use this for long lists: every + builds a new list, so the total work grows quadratically with the number of sublists.
In [4]: from functools import reduce  # Python 3
In [5]: reduce(lambda x, y: x + y, [[1,2,3],[1,2],[1,4,5,6,7]])
Out[5]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
To make it slightly faster, you can use operator.add, which is a built-in function, instead of a lambda:
In [6]: from operator import add
In [7]: reduce(add, [[1,2,3],[1,2],[1,4,5,6,7]])
Out[7]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
In [8]: %timeit reduce(lambda x, y: x + y, [[1,2,3],[1,2],[1,4,5,6,7]])
789 ns ± 7.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [9]: %timeit reduce(add, [[1,2,3],[1,2],[1,4,5,6,7]])
635 ns ± 4.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
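One detail the session above does not cover: without an initial value, reduce raises TypeError on an empty outer list. A possible sketch that passes [] as the initializer is shown here; the helper name flatten_reduce is hypothetical:

```python
from functools import reduce
from operator import add


def flatten_reduce(list_of_lists):
    """Flatten one level with reduce; intended for short inputs only."""
    # The third argument is the initial accumulator, so an empty outer list
    # returns [] instead of raising TypeError. add(a, b) returns a new list,
    # so neither the initializer nor the input sublists are mutated.
    return reduce(add, list_of_lists, [])


print(flatten_reduce([[1, 2, 3], [1, 2], [1, 4, 5, 6, 7]]))
print(flatten_reduce([]))  # []
```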
Benchmark:
:~$ python -m timeit "from itertools import chain;chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 1.58 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 0.791 usec per loop
:~$ python -m timeit "[j for i in [[1,2,3],[1,2],[1,4,5,6,7]] for j in i]"
1000000 loops, best of 3: 0.784 usec per loop
A benchmark on @Will's answer that used sum (it's fast for short lists but not for long lists):
:~$ python -m timeit "sum([[1,2,3],[4,5,6],[7,8,9]], [])"
1000000 loops, best of 3: 0.575 usec per loop
:~$ python -m timeit "sum([range(100),range(100)], [])"
100000 loops, best of 3: 2.27 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[range(100),range(100)])"
100000 loops, best of 3: 2.1 usec per loop
For just a list like this, my favourite neat little trick is just to use sum; sum has an optional argument: sum(iterable [, start]), so you can do:
list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
print(sum(list_of_lists, []))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
This works because the + operator happens to be the concatenation operator for lists, and you've told it that the starting value is [], an empty list.
But the documentation for sum advises that you use itertools.chain instead, as it's much clearer.
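If you want to check on your own machine how the two approaches scale, a rough sketch with the timeit module follows. The helper names, input sizes, and repeat count are arbitrary choices for illustration, and the timings will vary by machine, so no expected numbers are shown:

```python
import timeit
from itertools import chain


def flatten_sum(lists):
    # Each + copies the accumulated list, so cost grows quadratically.
    return sum(lists, [])


def flatten_chain(lists):
    # Visits every item once, so cost grows linearly.
    return list(chain.from_iterable(lists))


for n in (100, 1000):
    data = [[0] * 10 for _ in range(n)]  # n sublists of 10 items each
    t_sum = timeit.timeit(lambda: flatten_sum(data), number=10)
    t_chain = timeit.timeit(lambda: flatten_chain(data), number=10)
    print("n=%d  sum: %.5fs  chain: %.5fs" % (n, t_sum, t_chain))
```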