Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to split a list into subsets based on a pattern?

Tags:

python

I'm doing this but it feels this can be achieved with much less code. It is Python after all. Starting with a list, I split that list into subsets based on a string prefix.

# Splitting a list into subsets
# expected outcome:
# [['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]

mylist = ['sub_0_a', 'sub_0_b', 'sub_1_a', 'sub_1_b']

def func(l, newlist=[], index=0):
    newlist.append([i for i in l if i.startswith('sub_%s' % index)])
    # create a new list without the items in newlist
    l = [i for i in l if i not in newlist[index]]

    if len(l):
        index += 1
        func(l, newlist, index)

func(mylist)
like image 224
droidballoon Avatar asked Nov 13 '12 21:11

droidballoon


People also ask

How do I split a list into a list?

With split In this approach we use the split function to extract each of the elements as they are separated by comma. We then keep appending this element as a list into the new created list.

How would you split a list into evenly sized chunks?

The easiest way to split list into equal sized chunks is to use a slice operator successively and shifting initial and final position by a fixed number.

How do you split a list by delimiter in Python?

Split by delimiter: split()Use split() method to split by delimiter. If the argument is omitted, it will be split by whitespace, such as spaces, newlines \n , and tabs \t . Consecutive whitespace is processed together. A list of the words is returned.


2 Answers

You could use itertools.groupby:

>>> import itertools
>>> mylist = ['sub_0_a', 'sub_0_b', 'sub_1_a', 'sub_1_b']
>>> for k,v in itertools.groupby(mylist,key=lambda x:x[:5]):
...     print k, list(v)
... 
sub_0 ['sub_0_a', 'sub_0_b']
sub_1 ['sub_1_a', 'sub_1_b']

or exactly as you specified it:

>>> [list(v) for k,v in itertools.groupby(mylist,key=lambda x:x[:5])]
[['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]

Of course, the common caveats apply (Make sure your list is sorted with the same key you're using to group), and you might need a slightly more complicated key function for real world data...

like image 186
mgilson Avatar answered Nov 15 '22 17:11

mgilson


In [28]: mylist = ['sub_0_a', 'sub_0_b', 'sub_1_a', 'sub_1_b']

In [29]: lis=[]

In [30]: for x in mylist:
    i=x.split("_")[1]
    try:
        lis[int(i)].append(x)
    except:    
        lis.append([])
        lis[-1].append(x)
   ....:         

In [31]: lis
Out[31]: [['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]
like image 40
Ashwini Chaudhary Avatar answered Nov 15 '22 18:11

Ashwini Chaudhary