Split list into sublist based on part of value

I have a list:

L= ['v1_A', 'v1_B', 'v1_C', 'V2_A', 'V2_B', 'V2000_A']

and I want to split it into sublists, so that all values starting with "v1" form one sublist, all values starting with "V2" form another, all values starting with "V2000" another, and so on.

The length and number of sublists can vary, but each is identified by the part before the underscore.

If you want to group your strings by the initial value, you have two options:

Use itertools.groupby(); this makes grouping easy provided your data is already sorted on that first value:
```
from itertools import groupby

grouped = [list(g) for k, g in groupby(L, lambda s: s.partition('_')[0])]
```
The lambda here provides groupby() with the key to group on; groupby() then yields a separate iterator (assigned to g in the above code) for each run of consecutive values that share the same key. As the lambda returns the first part of each string, the input is grouped on your v1, V2, V2000, etc. prefixes.
Use a dictionary to group items by the common prefix. Use this if your input is not sorted:
```
grouped = {}
for elem in L:
    key = elem.partition('_')[0]
    grouped.setdefault(key, []).append(elem)
grouped = grouped.values()
```
In Python 3, dict.values() returns a view rather than a list, so that last line would be grouped = list(grouped.values())

Both produce a list of sublists, one per prefix, with every value placed in the sublist for its prefix. Both use str.partition() to split off just the part before the first underscore (_).
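If your input is not sorted but you would still like to use groupby(), one possible approach is to sort on the same key first. The following is only a sketch: the helper names prefix and group_sorted and the sample list unsorted are made up for illustration, and it assumes you don't mind the groups coming back in sorted-prefix order rather than input order.

```python
from itertools import groupby

def prefix(s):
    """Return the part of s before the first underscore."""
    return s.partition('_')[0]

def group_sorted(items):
    # groupby() only merges *adjacent* items with equal keys,
    # so sort on the same key first to bring equal prefixes together.
    ordered = sorted(items, key=prefix)
    return [list(g) for _, g in groupby(ordered, key=prefix)]

# Hypothetical unsorted variant of the list from the question
unsorted = ['V2_A', 'v1_A', 'V2000_A', 'v1_B', 'V2_B', 'v1_C']
result = group_sorted(unsorted)
print(result)
# [['V2_A', 'V2_B'], ['V2000_A'], ['v1_A', 'v1_B', 'v1_C']]
```

Note that the sort is a plain string comparison, so uppercase prefixes sort before lowercase ones; the dictionary approach avoids the sort altogether.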

Demo:

>>> from itertools import groupby
>>> L= ['v1_A', 'v1_B', 'v1_C', 'V2_A', 'V2_B', 'V2000_A']
>>> [list(g) for k, g in groupby(L, lambda s: s.partition('_')[0])]
[['v1_A', 'v1_B', 'v1_C'], ['V2_A', 'V2_B'], ['V2000_A']]
>>> grouped = {}
>>> for elem in L:
...     key = elem.partition('_')[0]
...     grouped.setdefault(key, []).append(elem)
... 
>>> grouped.values()
[['V2_A', 'V2_B'], ['V2000_A'], ['v1_A', 'v1_B', 'v1_C']]
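The demo output above is from Python 2, where dictionary order is arbitrary. As a rough Python 3 equivalent, here is a sketch using collections.defaultdict in place of setdefault(); it assumes Python 3.7 or newer, where dictionaries keep insertion order, so the groups come out in the order each prefix is first seen. The name result is only for illustration.

```python
from collections import defaultdict

L = ['v1_A', 'v1_B', 'v1_C', 'V2_A', 'V2_B', 'V2000_A']

# Missing keys automatically start out as an empty list
grouped = defaultdict(list)
for elem in L:
    key = elem.partition('_')[0]  # part before the first underscore
    grouped[key].append(elem)

# values() is a view in Python 3; wrap it in list() to get a real list
result = list(grouped.values())
print(result)
# [['v1_A', 'v1_B', 'v1_C'], ['V2_A', 'V2_B'], ['V2000_A']]
```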

L= ['v1_A', 'v1_B', 'v1_C', 'V2_A', 'V2_B', 'V2000_A']
new_L = []
for i in L:
    new_item = i.split('_')
    new_L.append(new_item)
print(new_L)

Output: [['v1', 'A'], ['v1', 'B'], ['v1', 'C'], ['V2', 'A'], ['V2', 'B'], ['V2000', 'A']]

Hope this gives you the desired result.

Tags:

python

user3910001

2 Answers

Martijn Pieters

Sesha
