
Combining regular expressions in Python - \W and \S

I want my code to return only the special characters [".", "*", "=", ","]. I want to remove all digits/alphabetical characters ("\W") and all whitespace ("\S").

import re

original_string = "John is happy. He owns 3*4=12, apples"
new_string = re.findall("\W\S",original_string)
print(new_string)

But instead I get this as my output: [' i', ' h', ' H', ' o', ' 3', '*4', '=1', ' a']

I have absolutely no idea why this happens. Hence I have two questions:

1) Is it possible to achieve my goal using regular expressions?

2) What is actually going on with my code?

asked Feb 16 '26 17:02 by EML


2 Answers

You were close, but you need to specify these escape sequences inside a character class.

re.findall(r'[^\w\s]', original_string)
# ['.', '*', '=', ',']

Note that the caret ^ indicates negation (i.e., don't match these characters).

Alternatively, instead of removing what you don't need, why not extract what you do?

re.findall(r'[.*=,]', original_string)
# ['.', '*', '=', ',']
answered Feb 19 '26 13:02 by cs95


The regular expression \W\S matches a sequence of two characters: one non-word character followed by one non-whitespace character. If you want to combine the two conditions on a single character, that's [^\w\s], which matches one character that is neither a word character nor whitespace.
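To make the difference concrete, here is a small sketch that runs both patterns against the string from the question. The variable names `pairs` and `singles` are only illustrative.

```python
import re

original_string = "John is happy. He owns 3*4=12, apples"

# \W\S consumes two characters per match: a non-word character
# immediately followed by a non-whitespace character.
pairs = re.findall(r"\W\S", original_string)

# [^\w\s] consumes a single character that is neither a word
# character nor whitespace.
singles = re.findall(r"[^\w\s]", original_string)

print(pairs)    # [' i', ' h', ' H', ' o', ' 3', '*4', '=1', ' a']
print(singles)  # ['.', '*', '=', ',']
```

Note that ". " and ", " never show up in the first result, because the character after the punctuation is a space and so fails \S.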

However, many characters other than the four you list also match this expression. If you want to remove every character which is not in your set, the character class that matches exactly those unwanted characters is simply [^.*=,]
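As a rough illustration, the sketch below uses a made-up sample string (not from the question) that contains punctuation outside the four listed characters, and compares the broad class, the explicit class, and removal with re.sub.

```python
import re

# Hypothetical sample text with extra punctuation ("!" and "?").
text = "Wait! Is 3*4=12, really?"

# The broad class picks up every non-word, non-space character.
broad = re.findall(r"[^\w\s]", text)

# The explicit class only picks up the characters listed in it.
narrow = re.findall(r"[.*=,]", text)

# Removing everything that is NOT in the set leaves only those characters.
kept = re.sub(r"[^.*=,]", "", text)

print(broad)   # ['!', '*', '=', ',', '?']
print(narrow)  # ['*', '=', ',']
print(kept)    # *=,
```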

Perhaps it's worth noting that inside [...] you don't need to backslash-escape e.g. the literal dot (the escape is harmless but redundant). Note also that a negated character class such as [^.*=,] does match a newline character; the re.DOTALL option only changes whether the bare . outside a character class matches newlines.
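These points are easy to check directly. The sketch below looks at how Python's re module treats a dot inside a character class, and how a newline is handled by the bare dot (with and without re.DOTALL) and by a negated class. The sample strings are arbitrary.

```python
import re

# Inside [...], the dot is a literal character; escaping it is redundant.
dot_hits = re.findall(r"[.]", "a.b")
escaped_hits = re.findall(r"[\.]", "a.b")

# The bare dot skips newlines unless re.DOTALL is given.
bare_default = re.findall(r".", "a\nb")
bare_dotall = re.findall(r".", "a\nb", re.DOTALL)

# A negated character class matches a newline without any flag.
negated = re.findall(r"[^.*=,]", "a\nb")

print(dot_hits, escaped_hits)     # ['.'] ['.']
print(bare_default, bare_dotall)  # ['a', 'b'] ['a', '\n', 'b']
print(negated)                    # ['a', '\n', 'b']
```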

If you are trying to extract and parse numerical expressions, regex can be a useful part of the lexical analysis, but you really want a proper parser.

answered Feb 19 '26 11:02 by tripleee


