Splitting a List Based on a Substring

Question

I have the following list:

['1(Reg)', '100', '103', '102', '100', '2(Reg)', '98', '101', '100', '3(Reg)', '96', '99', '98', '4(Reg)', '100', '100', '100', '100', '5(Reg)', '98', '99', '99', '100', '6(Reg)', '99.47', '99.86', '99.67', '100']

I want to split this list into multiple lists so that each sublist will have the substring "(Reg)" appear once:

[['1(Reg)', '100', '103', '102', '100'],
['2(Reg)', '98', '101', '100'],
['3(Reg)', '96', '99', '98'],
['4(Reg)', '100', '100', '100', '100'],
['5(Reg)', '98', '99', '99', '100'],
['6(Reg)', '99.47', '99.86', '99.67', '100']]

I've tried joining the list with a delimiter and splitting it by (Reg), but that didn't work. How can I split the list into a nested list like above?

DYZ · Accepted Answer

A slightly different (optimized) version of WVO's answer:

splitted = []

for item in l:
    if '(Reg)' in item:
        splitted.append([])
    splitted[-1].append(item)

#[['1(Reg)', '100', '103', '102', '100'], ['2(Reg)', '98', '101', '100'], 
# ['3(Reg)', '96', '99', '98'], ['4(Reg)', '100', '100', '100', '100'], 
# ['5(Reg)', '98', '99', '99', '100'], 
# ['6(Reg)', '99.47', '99.86', '99.67', '100']]

jpp · Answer

Here is one way, though not necessarily optimal:

from itertools import zip_longest

lst = ['1(Reg)', '100', '103', '102', '100', '2(Reg)', '98', '101', '100',
       '3(Reg)', '96', '99', '98', '4(Reg)', '100', '100', '100', '100',
       '5(Reg)', '98', '99', '99', '100', '6(Reg)', '99.47', '99.86', '99.67', '100']

indices = [i for i, j in enumerate(lst) if '(Reg)' in j]
lst_new = [lst[i:j] for i, j in zip_longest(indices, indices[1:])]

# [['1(Reg)', '100', '103', '102', '100'],
#  ['2(Reg)', '98', '101', '100'],
#  ['3(Reg)', '96', '99', '98'],
#  ['4(Reg)', '100', '100', '100', '100'],
#  ['5(Reg)', '98', '99', '99', '100'],
#  ['6(Reg)', '99.47', '99.86', '99.67', '100']]

Ajax1234 · Answer

You can use itertools.groupby with regular expressions:

import itertools
import re
s = ['1(Reg)', '100', '103', '102', '100', '2(Reg)', '98', '101', '100', '3(Reg)', '96', '99', '98', '4(Reg)', '100', '100', '100', '100', '5(Reg)', '98', '99', '99', '100', '6(Reg)', '99.47', '99.86', '99.67', '100']
new_data = [list(b) for _, b in itertools.groupby(s, key=lambda x:bool(re.findall('\d+\(', x)))]
final_data = [new_data[i]+new_data[i+1] for i in range(0, len(new_data), 2)]

Output:

[['1(Reg)', '100', '103', '102', '100'], 
 ['2(Reg)', '98', '101', '100'], 
 ['3(Reg)', '96', '99', '98'], 
 ['4(Reg)', '100', '100', '100', '100'], 
 ['5(Reg)', '98', '99', '99', '100'], 
 ['6(Reg)', '99.47', '99.86', '99.67', '100']]

Splitting a List Based on a Substring

Tags:

python

string

list

python-3.x

nested-lists

whackamadoodle3000

3 Answers

DYZ

jpp

Ajax1234

Recent Activity

Donate For Us

Splitting a List Based on a Substring

Tags:

python

string

list

python-3.x

nested-lists

whackamadoodle3000

3 Answers

DYZ

jpp

Ajax1234

Related questions

Recent Activity

Donate For Us