
Python split for lists

Tags:

python

list

If we have a list of strings in Python and want to create sublists by splitting on some special string, how should we do it?

For instance:

l = ["data","more data","","data 2","more data 2","danger","","date3","lll"]
p = split_special(l,"")

would generate:

p = [["data","more data"],["data 2","more data 2","danger"],["date3","lll"]] 
ppaulojr asked Jan 25 '13 20:01



2 Answers

itertools.groupby is one approach (as it often is):

>>> l = ["data","more data","","data 2","more data 2","danger","","date3","lll"]
>>> from itertools import groupby
>>> groupby(l, lambda x: x == "")
<itertools.groupby object at 0x9ce06bc>
>>> [list(group) for k, group in groupby(l, lambda x: x == "") if not k]
[['data', 'more data'], ['data 2', 'more data 2', 'danger'], ['date3', 'lll']]

We can even cheat a little in this particular case: the separator is the empty string, which is falsy, so bool itself can serve as the key function:

>>> [list(group) for k, group in groupby(l, bool) if k]
[['data', 'more data'], ['data 2', 'more data 2', 'danger'], ['date3', 'lll']]
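To get the split_special(l, "") call from the question, the groupby expression can be wrapped in a function. This is only a minimal sketch: the name split_special comes from the question, while the parameter names items and sep are my own choice. Note that with this approach consecutive separators, or a separator at the start or end of the list, do not produce empty sublists.

```python
from itertools import groupby


def split_special(items, sep):
    """Split items into sublists at every element equal to sep.

    Elements are grouped by whether they equal the separator;
    the separator groups are dropped and the rest are kept.
    """
    return [
        list(group)
        for is_sep, group in groupby(items, lambda x: x == sep)
        if not is_sep
    ]


l = ["data", "more data", "", "data 2", "more data 2", "danger", "", "date3", "lll"]
p = split_special(l, "")
print(p)
```

Unlike the bool shortcut, this version compares against sep explicitly, so it also works when the separator is a non-empty string.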
DSM answered Sep 22 '22 23:09


One possible implementation using itertools

>>> l
['data', 'more data', '', 'data 2', 'more data 2', 'danger', '', 'date3', 'lll']
>>> it_l = iter(l)
>>> from itertools import takewhile, dropwhile
>>> [[e] + list(takewhile(lambda e: e != "", it_l)) for e in it_l if e != ""]
[['data', 'more data'], ['data 2', 'more data 2', 'danger'], ['date3', 'lll']]
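The comprehension works because the outer loop and takewhile both consume the same iterator: the outer loop picks up the first element of a group, takewhile drains the rest of that group and also swallows the separator that ends it. As a rough sketch of the same logic written as explicit loops (the function name split_special_loop and its parameters are hypothetical, chosen only for illustration):

```python
def split_special_loop(items, sep):
    """Explicit-loop equivalent of the takewhile comprehension."""
    it = iter(items)  # one iterator shared by both loops
    result = []
    for first in it:
        if first == sep:
            # Skip leading or repeated separators.
            continue
        group = [first]
        # Keep pulling from the SAME iterator until a separator appears.
        # The separator itself is consumed here and discarded.
        for item in it:
            if item == sep:
                break
            group.append(item)
        result.append(group)
    return result


l = ["data", "more data", "", "data 2", "more data 2", "danger", "", "date3", "lll"]
print(split_special_loop(l, ""))
```

If a plain list were used in place of the shared iterator, each inner loop would restart from the beginning of the list, so the iter() call is essential.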

Note: this is about as fast as using groupby (slightly slower in the timings below):

>>> import timeit
>>> stmt_dsm = """
... [list(group) for k, group in groupby(l, lambda x: x == "") if not k]
... """
>>> stmt_ab = """
... it_l = iter(l)
... [[e] + list(takewhile(lambda e: e != "", it_l)) for e in it_l if e != ""]
... """
>>> t_ab = timeit.Timer(stmt = stmt_ab, setup = "from __main__ import l, dropwhile, takewhile")
>>> t_dsm = timeit.Timer(stmt = stmt_dsm, setup = "from __main__ import l, groupby")
>>> t_ab.timeit(100000)
1.6863486541265047
>>> t_dsm.timeit(100000)
1.5298066765462863
>>> t_ab.timeit(100000)
1.735611326163962
Abhijit answered Sep 21 '22 23:09