
regular expression split : FutureWarning: split() requires a non-empty pattern match

I get a warning in Python 3.6 when I call re.split() as follows:

import re

pattern = re.compile(r'\s*')  # zero or more whitespace characters
match = re.split(pattern, 'I am going to school')
print(match)

python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
  return _compile(pattern, flags).split(string, maxsplit)

I don't understand why I am getting this warning.

pavikirthi asked Nov 30 '17 01:11


1 Answer

You are getting this warning because the pattern \s* asks re.split to split on runs of zero or more whitespace characters.

But... the empty string matches that pattern, because there are zero whitespaces in it!
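
As a quick check of that claim, here is a minimal sketch using re.fullmatch (the variable names are only illustrative). It tests whether each pattern can match the empty string:

```python
import re

# r'\s*' means "zero or more whitespace characters",
# so the empty string is a valid match.
star_matches_empty = re.fullmatch(r'\s*', '') is not None

# r'\s+' means "one or more whitespace characters",
# so it needs at least one character to match.
plus_matches_empty = re.fullmatch(r'\s+', '') is not None

print(star_matches_empty)  # True
print(plus_matches_empty)  # False
```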

It's unclear what re.split should do with this. This is what str.split does:

>>> 'hello world'.split('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: empty separator
>>>

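In Python 3.6, re.split simply ignores the empty matches, so in effect it splits on one or more whitespace characters. It emits the FutureWarning you're seeing to tell you about that decision, and because the behavior was due to change: from Python 3.7 onward, re.split does split on empty matches, so \s* would also split between adjacent non-whitespace characters.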

You can state that intent explicitly by replacing * with +:

$ python3.6 -c "import re; print(re.split('\s*', 'I am going to school'))"
/usr/lib64/python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
  return _compile(pattern, flags).split(string, maxsplit)
['I', 'am', 'going', 'to', 'school']

$ python3.6 -c "import re; print(re.split('\s+', 'I am going to school'))"
['I', 'am', 'going', 'to', 'school']
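
If it helps, the same fix can be written as a small script. This is only a sketch: the names whitespace, words_re, words_str and padded are made up for illustration, and str.split() with no arguments is shown as an alternative that the question did not ask about. The two approaches differ when the text has leading or trailing whitespace:

```python
import re

text = 'I am going to school'

# One or more whitespace characters: this pattern can never match
# the empty string, so no FutureWarning is emitted.
whitespace = re.compile(r'\s+')
words_re = whitespace.split(text)

# str.split() with no separator also splits on runs of whitespace.
words_str = text.split()

print(words_re)   # ['I', 'am', 'going', 'to', 'school']
print(words_str)  # ['I', 'am', 'going', 'to', 'school']

# With leading/trailing whitespace, re.split keeps empty strings
# at the ends, while str.split() drops them.
padded = '  I am  going '
padded_re = whitespace.split(padded)
padded_str = padded.split()

print(padded_re)   # ['', 'I', 'am', 'going', '']
print(padded_str)  # ['I', 'am', 'going']
```

For plain whitespace splitting, str.split() is usually enough. re.split is worth reaching for when the separator is a more general pattern.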
azhrei answered Sep 24 '22 12:09