How to split this string with python?

Question

I have strings that look like this example: "AAABBBCDEEEEBBBAA"

Any character is possible in the string.

I want to split it to a list like: ['AAA','BBB','C','D','EEEE','BBB','AA']

so that every contiguous run of the same character becomes a separate element of the resulting list.

I know that I can iterate over the characters in the string and compare each character at index i with the one at i-1, but is there a simpler solution out there?

kennytm · Accepted Answer

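We could use a regular expression. The pattern (.)\1* matches any single character followed by zero or more repeats of that same character: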

>>> import re
>>> r = re.compile(r'(.)\1*')
>>> [m.group() for m in r.finditer('AAABBBCDEEEEBBBAA')]
['AAA', 'BBB', 'C', 'D', 'EEEE', 'BBB', 'AA']
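One caveat, since the question says any character is possible: by default `.` does not match a newline, so `finditer` would silently skip newline characters. A minimal sketch of a reusable version that passes `re.DOTALL` to cover that case follows; the names `RUN_PATTERN` and `split_runs_regex` are made up for illustration and are not part of the original answer.

```python
import re

# (.) captures any one character; \1* matches zero or more repeats of it.
# re.DOTALL makes '.' match newlines too, so runs of '\n' are kept.
RUN_PATTERN = re.compile(r'(.)\1*', re.DOTALL)


def split_runs_regex(text):
    """Return the runs of identical consecutive characters in text."""
    return [m.group() for m in RUN_PATTERN.finditer(text)]


if __name__ == '__main__':
    print(split_runs_regex('AAABBBCDEEEEBBBAA'))
    print(split_runs_regex('AA\n\nB'))
```

If the strings never contain newlines, the flag makes no difference and the original pattern behaves the same.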

Alternatively, we could use itertools.groupby, which groups consecutive equal items:

>>> import itertools
>>> [''.join(g) for k, g in itertools.groupby('AAABBBCDEEEEBBBAA')]
['AAA', 'BBB', 'C', 'D', 'EEEE', 'BBB', 'AA']
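The groupby one-liner can be wrapped in a small helper if it is needed in more than one place. This is only a sketch; the function name `split_runs` is hypothetical. Because `groupby` compares items for equality rather than matching a pattern, it handles every character, including newlines, without extra flags.

```python
import itertools


def split_runs(text):
    """Split text into runs of identical consecutive characters."""
    # groupby yields (key, group) pairs; each group is an iterator over
    # consecutive equal characters, so joining it rebuilds the run.
    return [''.join(group) for _key, group in itertools.groupby(text)]


if __name__ == '__main__':
    print(split_runs('AAABBBCDEEEEBBBAA'))
```

An empty input string yields an empty list, since `groupby` produces no groups.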

timeit shows that the regex is faster for this particular string (tested on Python 2.6 and Python 3.1). That is not too surprising: regular expressions are specialized for strings, whereas groupby is a generic function that works on any iterable.
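To repeat the comparison on your own interpreter and data, something along these lines should work. It is a rough sketch: the helper names `with_regex`, `with_groupby`, and `compare` and the repetition count are arbitrary choices, and the timings will vary with Python version and input, so no numbers are shown here.

```python
import itertools
import re
import timeit

SAMPLE = 'AAABBBCDEEEEBBBAA'
PATTERN = re.compile(r'(.)\1*')


def with_regex(text):
    # Regex approach: one match per run of identical characters.
    return [m.group() for m in PATTERN.finditer(text)]


def with_groupby(text):
    # groupby approach: join each group of consecutive equal characters.
    return [''.join(g) for _k, g in itertools.groupby(text)]


def compare(text=SAMPLE, number=10000):
    """Return (regex_seconds, groupby_seconds) for `number` calls each."""
    t_regex = timeit.timeit(lambda: with_regex(text), number=number)
    t_groupby = timeit.timeit(lambda: with_groupby(text), number=number)
    return t_regex, t_groupby


if __name__ == '__main__':
    t_regex, t_groupby = compare()
    print('regex:   %.4f s' % t_regex)
    print('groupby: %.4f s' % t_groupby)
```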

Tags:

python

string

split

jan
