
How to replace all \W (non-word characters), with the exception of '-' (dash), using a regular expression?

I want to replace every \W (non-word) character, with the exception of the dash -, with a space, e.g.:

  1. black-white will give black-white
  2. black#white will give black white

I know regular expressions very well, but I have no idea how to deal with this.

Note that I want to use Unicode, so \w is not just [a-zA-Z] as it would be for English-only text. I prefer Python re syntax but can read other suggestions.

Chameleon asked Dec 21 '14 10:12


2 Answers

Use a negated character class: \W is equivalent to [^\w], so [^-\w] matches everything \W matches except -.

>>> re.sub(r'[^-\w]', ' ', 'black-white')
'black-white'
>>> re.sub(r'[^-\w]', ' ', 'black#white')
'black white'
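
Since the question mentions Unicode, here is a minimal sketch of the same negated class applied to non-ASCII text. It assumes Python 3, where \w is Unicode-aware for str patterns by default; the function name and the sample string are only illustrative.

```python
import re

def replace_non_word_except_dash(text):
    """Replace every character that is neither a word character nor '-' with a space."""
    # [^-\w] is a negated class: anything that is NOT a hyphen and NOT \w.
    # In Python 3, \w matches Unicode letters and digits plus '_' for str patterns.
    return re.sub(r'[^-\w]', ' ', text)

print(replace_non_word_except_dash('black-white'))       # black-white
print(replace_non_word_except_dash('black#white'))       # black white
print(replace_non_word_except_dash('schwarz-weiß#rot'))  # schwarz-weiß rot
```

Note that \w also covers digits and the underscore, so those characters are kept as well. On Python 2 the same pattern would need a unicode string and the re.UNICODE flag to treat non-ASCII letters as word characters.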

If you use the third-party regex package, you can use nested sets and set operations:

>>> import regex
>>> print(regex.sub(r'(?V1)[\W--[-]]', ' ', 'black-white'))
black-white
>>> print(regex.sub(r'(?V1)[\W--[-]]', ' ', 'black#white'))
black white
falsetru answered Oct 05 '22 14:10


I would use a negative lookahead, as shown below:

>>> re.sub(r'(?!-)\W', r' ', 'black-white')
'black-white'
>>> re.sub(r'(?!-)\W', r' ', 'black#white')
'black white'

In (?!-)\W, the negative lookahead (?!-) at the start asserts that the next character is not a hyphen, and \W then matches a single non-word character. The effect is a kind of subtraction: \W minus the character inside the negative lookahead (here, the hyphen).
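
As a quick sanity check of that "subtraction" reading, the sketch below compares the lookahead pattern with the negated character class [^-\w] on a few made-up inputs. The constant names and sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Two ways to express "any non-word character except the hyphen".
LOOKAHEAD = re.compile(r'(?!-)\W')      # \W, but refuse to match when the next char is '-'
NEGATED_CLASS = re.compile(r'[^-\w]')   # anything that is neither '-' nor a word character

samples = ['black-white', 'black#white', 'a-b c!d_e', '--##--']

for s in samples:
    via_lookahead = LOOKAHEAD.sub(' ', s)
    via_class = NEGATED_CLASS.sub(' ', s)
    # Both patterns should select exactly the same characters.
    assert via_lookahead == via_class
    print(repr(s), '->', repr(via_lookahead))
```

Both forms should behave the same on inputs like these; the negated class is usually the simpler choice, while the lookahead form may be handy when the excluded part is more than a single character.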

Avinash Raj answered Oct 05 '22 13:10