
re.split() gives empty elements in list

Tags: python, regex

Please help with this case:

import re

m = re.split('([A-Z][a-z]+)', 'PeopleRobots')
print(m)

Result:

['', 'People', '', 'Robots', '']

Why does the list have empty elements?

Vasia Pupkin asked Aug 08 '13 10:08




1 Answer

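According to the re.split documentation:

If there are capturing groups in the separator and it matches at the start of the string, the result will start with an empty string. The same holds for the end of the string.

The empty string in the middle appears for a similar reason: 'People' and 'Robots' are adjacent matches, so the text between the two separators is empty.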
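To see where each empty string comes from, it can help to look at the positions of the matches. This is only a small sketch using the standard re module; the variable names (text, pattern, spans, parts) are illustrative and not part of the original question:

```python
import re

text = 'PeopleRobots'
pattern = '([A-Z][a-z]+)'

# Every match is treated as a separator. Because the pattern contains a
# capturing group, re.split also keeps the separator text in the result.
spans = [m.span() for m in re.finditer(pattern, text)]
print(spans)  # [(0, 6), (6, 12)]

# The non-separator pieces are the slices around those spans:
#   text[0:0]   -> ''  (nothing before 'People')
#   text[6:6]   -> ''  (nothing between 'People' and 'Robots')
#   text[12:12] -> ''  (nothing after 'Robots')
parts = re.split(pattern, text)
print(parts)  # ['', 'People', '', 'Robots', '']
```

The elements at even indices are the (empty) pieces between separators, and the elements at odd indices are the captured separators themselves.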

If you want to get People and Robots, use re.findall:

>>> re.findall('([A-Z][a-z]+)', 'PeopleRobots')
['People', 'Robots']

You can omit grouping:

>>> re.findall('[A-Z][a-z]+', 'PeopleRobots')
['People', 'Robots']
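If you would rather keep re.split for some reason, one possible workaround (a sketch, not something the documentation prescribes) is to drop the empty strings afterwards. The names parts and words below are just for illustration:

```python
import re

parts = re.split('([A-Z][a-z]+)', 'PeopleRobots')

# Empty strings are falsy, so a simple filter removes them.
words = [p for p in parts if p]
print(words)  # ['People', 'Robots']
```

For this input the result is the same as with re.findall, which is usually the more direct choice.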
falsetru answered Nov 10 '22 00:11
