Splitting a list into uneven groups?

Tags:

python

list

split

python-2.7

sublist

I know how to split a list into even groups, but I'm having trouble splitting it into uneven groups.

Essentially here is what I have: some list, let's call it mylist, that contains x elements.

I also have another list, let's call it second_list, that looks something like this:

{2, 4, 5, 9, etc.}

Now what I want to do is divide mylist into uneven groups by the spacing in second_list. So, I want my first group to be the first 2 elements of mylist, the second group to be the next 4 elements of mylist, the third group to be the next 5 elements of mylist, the fourth group to be the next 9 elements of mylist, and so on.

Is there some easy way to do this? I tried something similar to what you would do to split it into even groups:

for j in range(0, len(second_list)):
    for i in range(0, len(mylist), second_list[j]):
        chunk_mylist = mylist[i:i+second_list[j]]

However, this doesn't split it the way I want. I want the number of sublists to be len(second_list), each with the correct length, but this produces many more sublists than that (and splits them incorrectly).

554

asked Aug 09 '16 22:08

J. P.

4 Answers

You can create an iterator and use itertools.islice:

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
seclist = [2,4,6]

from itertools import islice
it = iter(mylist)

sliced = [list(islice(it, 0, i)) for i in seclist]

Which would give you:

[[1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]

Each islice call consumes i elements from the shared iterator, and consumed elements are gone, so the next call picks up with the following i elements.
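
To make the consumption behaviour concrete, here is a small sketch (the names first, second and rest are only for illustration). Every islice call pulls from the same underlying iterator, so each one starts where the previous one stopped:

```python
from itertools import islice

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
it = iter(mylist)              # one shared iterator over the list

first = list(islice(it, 2))    # consumes 1 and 2
second = list(islice(it, 4))   # continues where the previous call stopped
rest = list(it)                # whatever has not been consumed yet

print(first)   # [1, 2]
print(second)  # [3, 4, 5, 6]
print(rest)    # [7, 8, 9, 10, 11, 12]
```

Calling islice on mylist directly instead of on it would restart from the first element every time, which is why the iterator is created once up front.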

I'm not sure what should happen with any remaining elements. If you want them added as a final group, you could do something like:

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
seclist = [2, 4, 6]

from itertools import islice

it = iter(mylist)

slices = [sli for sli in (list(islice(it, 0, i)) for i in seclist)]
remaining = list(it)
if remaining:
    slices.append(remaining)
print(slices)

Which would give you:

[[1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14]]

Conversely, if mylist runs out before seclist does, there are a couple of ways to drop the resulting empty lists. One is to filter the inner generator expression:

from itertools import islice

it = iter(mylist)
slices = [sli for sli in (list(islice(it, 0, i)) for i in seclist) if sli]

Or combine with itertools.takewhile:

from itertools import islice, takewhile

it = iter(mylist)
slices = list(takewhile(bool, (list(islice(it, 0, i)) for i in seclist)))

Which for:

mylist = [1, 2, 3, 4, 5, 6]
seclist = [2, 4, 6, 8]

would give you:

[[1, 2], [3, 4, 5, 6]]

As opposed to:

[[1, 2], [3, 4, 5, 6], [], []]

What you use depends entirely on your possible inputs and how you would like to handle the various possibilities.
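
If you want those choices in one place, a minimal sketch could look like the following. The function name split_by_sizes and the flags keep_remainder and drop_empty are hypothetical, made up for this example; they are not part of itertools:

```python
from itertools import islice


def split_by_sizes(items, sizes, keep_remainder=False, drop_empty=False):
    """Split items into consecutive chunks whose lengths come from sizes.

    Hypothetical helper for illustration; the flag names are assumptions.
    """
    it = iter(items)
    chunks = [list(islice(it, size)) for size in sizes]
    if drop_empty:
        # Remove chunks that came back empty because items ran out.
        chunks = [chunk for chunk in chunks if chunk]
    if keep_remainder:
        # Anything the sizes did not cover becomes one final chunk.
        remaining = list(it)
        if remaining:
            chunks.append(remaining)
    return chunks


print(split_by_sizes(range(1, 15), [2, 4, 6], keep_remainder=True))
print(split_by_sizes([1, 2, 3, 4, 5, 6], [2, 4, 6, 8], drop_empty=True))
```

Note that a chunk can still come back shorter than requested if the input ends partway through it; whether that should be kept, dropped, or treated as an error is another decision that depends on your data.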

139

answered Sep 19 '22 12:09

Padraic Cunningham

A numpythonic approach:

>>> import numpy as np
>>> lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> sec = [2, 4, 5]
>>> np.split(lst, np.cumsum(sec))
[array([0, 1]), array([2, 3, 4, 5]), array([ 6,  7,  8,  9, 10]), array([11])]

And here is a Python 3.x approach using itertools.accumulate():

>>> lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> sec = [2, 4, 5]
>>> from itertools import accumulate
>>> bounds = list(accumulate(sec))
>>> bounds = [0] + bounds + [None] if bounds[0] != 0 else bounds + [None]
>>> 
>>> [lst[i:j] for i, j in zip(sec, sec[1:])]
[[0, 1], [2, 3, 4, 5], [6, 7, 8, 9, 10], [11]]
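
As a variation on the same idea, and assuming Python 3.8 or newer, accumulate() accepts an initial keyword that supplies the leading 0 directly. This sketch keeps the leftover elements separate instead of appending them as a final group; the names bounds, groups and leftover are illustrative:

```python
from itertools import accumulate

lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
sec = [2, 4, 5]

# initial=0 prepends the starting offset, giving [0, 2, 6, 11]
bounds = list(accumulate(sec, initial=0))

# Pair each offset with the next one to get (start, stop) for every group.
groups = [lst[i:j] for i, j in zip(bounds, bounds[1:])]
leftover = lst[bounds[-1]:]

print(groups)    # [[0, 1], [2, 3, 4, 5], [6, 7, 8, 9, 10]]
print(leftover)  # [11]
```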

answered Sep 20 '22 12:09

Mazdak

Using a list comprehension together with slicing and the sum() function (all basic, built-in tools of Python):

mylist = [1,2,3,4,5,6,7,8,9,10]
seclist = [2,4,6]

[mylist[sum(seclist[:i]):sum(seclist[:i+1])] for i in range(len(seclist))]

#output:
[[1, 2], [3, 4, 5, 6], [7, 8, 9, 10]]

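The comprehension above recomputes sum(seclist[:i]) for every group, so the work grows quadratically with len(seclist). If seclist is very long and you wish to be more efficient, use numpy.cumsum() first: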

import numpy as np
cumlist = np.hstack((0, np.cumsum(seclist)))
[mylist[cumlist[i]:cumlist[i+1]] for i in range(len(cumlist)-1)]

This gives the same result.

answered Sep 18 '22 12:09

Ohad Eytan

This solution keeps track of how many items have been consumed so far. It will not raise an error if the numbers in second_list add up to more than len(mylist); slicing past the end of a list simply yields shorter or empty chunks.

total = 0  # number of elements of mylist already placed in a chunk
listChunks = []
for size in second_list:
    chunk_mylist = mylist[total:total + size]
    listChunks.append(chunk_mylist)
    total += size

After running this, listChunks is a list containing sublists with the lengths found in second_list.
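
Python slices that run past the end of a list return a shorter (or empty) list rather than raising an exception. If you would rather fail loudly when second_list asks for more elements than mylist has, one option is an explicit check up front. The function name chunk_strict and the choice of ValueError are assumptions made for this sketch:

```python
def chunk_strict(items, sizes):
    """Hypothetical strict variant: reject sizes that need more items than exist."""
    if sum(sizes) > len(items):
        raise ValueError(
            f"sizes add up to {sum(sizes)}, but only {len(items)} items are available"
        )
    chunks = []
    total = 0  # running offset into items
    for size in sizes:
        chunks.append(items[total:total + size])
        total += size
    return chunks


print(chunk_strict([1, 2, 3, 4, 5, 6], [2, 4]))  # [[1, 2], [3, 4, 5, 6]]
```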