Difference between re.split(" ", string) and re.split("\s+", string)?

Question

I'm currently studying regular expressions and have a question, which is the one in the title. Since \s represents whitespace, I thought re.split(" ", string) and re.split("\s+", string) would return the same values, as shown next:

>>> import re
>>> a = re.split(" ", "Why is this wrong")
>>> a
['Why', 'is', 'this', 'wrong']

>>> import re
>>> a = re.split("\s+", "Why is this wrong")
>>> a
['Why', 'is', 'this', 'wrong']

These two give the same result, so I thought they were the same thing. However, it turns out that they are different. In what case would they differ, and what am I missing here?

Patrick Artner · Accepted Answer

These only look similar because of your example.

A split on ' ' (a single space) does exactly that: it splits on every single space. Consecutive spaces therefore produce empty strings in the result.

A split on '\s+' treats a run of consecutive whitespace characters as one separator, and it also matches whitespace characters other than plain spaces, such as tabs and newlines:

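import re

# Four spaces after "Why"; between "this" and "wrong": two spaces, tab, space, tab, two spaces.
data = "Why    is this  \t \t  wrong"

a = re.split(" ", data)      # split on every single space
b = re.split(r"\s+", data)   # split on each run of whitespace (raw string keeps \s intact)

print(a)
print(b)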

Output:

# re.split(" ", data)
['Why', '', '', '', 'is', 'this', '', '\t', '\t', '', 'wrong']

# re.split(r"\s+", data)
['Why', 'is', 'this', 'wrong']
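
The two patterns differ in two independent ways: the character class (a literal space versus \s) and the quantifier (none versus +). As a rough sketch, the example below varies each one separately; the sample string and the variable names are made up for illustration and are not part of the original question.

```python
import re

# Illustrative input: two spaces, then a tab, then one space.
text = "Why  is\tthis wrong"

# Literal space, no quantifier: every space is its own separator.
by_space = re.split(r" ", text)
# Literal space with +: runs of spaces collapse, but the tab is not a separator.
by_space_run = re.split(r" +", text)
# \s, no quantifier: the tab splits too, but runs still leave empty strings.
by_ws = re.split(r"\s", text)
# \s with +: any run of whitespace is one separator.
by_ws_run = re.split(r"\s+", text)

print(by_space)      # ['Why', '', 'is\tthis', 'wrong']
print(by_space_run)  # ['Why', 'is\tthis', 'wrong']
print(by_ws)         # ['Why', '', 'is', 'this', 'wrong']
print(by_ws_run)     # ['Why', 'is', 'this', 'wrong']
```

So the empty strings come from the missing +, while the unsplit tab comes from using a literal space instead of \s.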

Documentation:

\s
Matches any whitespace character; this is equivalent to the class [ \t\n\r\f\v]. (https://docs.python.org/3/howto/regex.html#matching-characters)
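
As a small check of that class, the sketch below splits a made-up string containing each of the listed characters, and then shows one edge case that is easy to overlook: leading or trailing whitespace still produces empty strings at the ends of the result. The inputs and variable names are illustrative only. Note that for str patterns, \s may also match further Unicode whitespace beyond this class.

```python
import re

# One separator of each kind: space, tab, newline, carriage return,
# form feed, vertical tab.
parts = re.split(r"\s+", "a b\tc\nd\re\ff\vg")
print(parts)   # ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Whitespace at the start or end of the string is also a separator,
# so re.split returns an empty string on that side.
padded = re.split(r"\s+", "  Why is this wrong\n")
print(padded)  # ['', 'Why', 'is', 'this', 'wrong', '']
```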

Tags:

python

split

python-re

Sihwan Lee
