
python string splitting

I have an input string like this: a1b2c30d40 and I want to tokenize the string to: a, 1, b, 2, c, 30, d, 40.

I know I can read the characters one by one and keep track of the previous character to decide whether to start a new token (two digits in a row belong to the same token), but is there a more Pythonic way of doing this?

Hery asked Jan 30 '11 16:01




1 Answer

>>> import re
>>> re.split(r'(\d+)', 'a1b2c30d40')
['a', '1', 'b', '2', 'c', '30', 'd', '40', '']

On the pattern: as the comment says, \d means "match one digit" and + is a quantifier that means "match one or more", so \d+ means "match as many digits as possible". This is put into a group (), so the entire pattern in the context of re.split means "split this string using as many digits as possible as the separator, additionally capturing the matched separators into the result". The trailing '' appears because the string ends with a separator, leaving an empty piece after the last match. If you omit the group, you get ['a', 'b', 'c', 'd', ''].
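If the trailing empty string is unwanted, it can be filtered out, or the tokens can be matched directly instead of split. The following is a minimal sketch, not part of the original answer: the function name tokenize and the pattern \d+|\D+ are my own choices for illustration.

```python
import re

def tokenize(s):
    # Hypothetical helper: collect alternating runs of digits (\d+)
    # and runs of non-digits (\D+), in the order they appear.
    return re.findall(r'\d+|\D+', s)

# Matching the tokens directly never produces empty strings.
tokens = tokenize('a1b2c30d40')

# Alternatively, keep the re.split approach and drop empty pieces.
split_tokens = [t for t in re.split(r'(\d+)', 'a1b2c30d40') if t]

print(tokens)
print(split_tokens)
```

Note that both variants keep consecutive letters together, so an input like 'ab12' would yield 'ab' and '12' rather than 'a', 'b', '12'. If each letter must be its own token, a pattern such as \d+|\D would be one option.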

Cat Plus Plus answered Sep 19 '22 17:09
