
Regex punctuation split with Python

Can anyone help me a bit with regexes? I currently have this: re.split(" +", line.rstrip()), which splits on runs of spaces.

How could I expand this to cover punctuation, too?

dantdj asked Sep 09 '25 16:09


1 Answer

The official Python documentation has a good example of this. The pattern \W+ splits on runs of non-word characters, which covers both whitespace and punctuation: \W is the character class for anything that is not a letter, digit, or underscore. Note: because the underscore "_" counts as a "word" character, it is not split on. Also, if the string starts or ends with a separator, the result contains an empty string at that end.

re.split(r'\W+', 'Words, words, words.')
# ['Words', 'words', 'words', '']
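
Applied to the line-splitting call from the question, the same pattern might look like the sketch below. The helper name split_words is hypothetical, and dropping the empty strings is an assumption about what is wanted; re.split itself keeps an empty string when the input starts or ends with a separator.

```python
import re

def split_words(line):
    """Split a line on runs of non-word characters, dropping empty pieces."""
    # r'\W+' matches one or more characters that are not letters, digits, or "_".
    # The raw string keeps Python from treating \W as a string escape.
    pieces = re.split(r'\W+', line.rstrip())
    # Filtering removes the '' produced by leading or trailing punctuation.
    return [piece for piece in pieces if piece]

print(split_words("Words, words, words."))            # ['Words', 'words', 'words']
print(split_words("Hello, world! It's snake_case."))  # ['Hello', 'world', 'It', 's', 'snake_case']
```

Note that the apostrophe in "It's" is also a non-word character, so contractions are split in two. If that matters, a pattern listing only the separators you want may be a better fit than \W+.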

See https://docs.python.org/3/library/re.html for more examples; search the page for "re.split".

Mister_Tom answered Sep 12 '25 06:09