
How to replace two or more repeated [[:punct:]] characters using re in Python?

I need to replace runs of two or more repeated punctuation characters in a string with a space, going from

"asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww qqqqqq aaaaa"

to

"asdasdasd - adasdasd asda -  asda wadsda +- + wwww qqqqqq aaaaa"

Using the regex101 app, I created this regex:

https://regex101.com/r/vdR5T1/1/

But when I tried it in Python:

import re
texto = "asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww qqqqqq aaaaa"
rx = re.compile(r'([[:punct:]])\1{2,}')
texto = rx.sub(' ', texto)
print(texto)

I got this warning:

FutureWarning: Possible nested set at position 2
  rx = re.compile(r'([[:punct:]])\1{2,}')

How can I run this (or a similar) regex in Python?

celsowm asked Oct 27 '25 04:10



1 Answer

Python's re module doesn't recognise POSIX bracket expressions, so [[:punct:]] is parsed as an ordinary character class that happens to start with a literal [ (hence the "Possible nested set" warning). You can replace it with a character class that contains all ASCII punctuation characters, e.g. [!-/:-@[-`{-~]. Note also that your regex requires 3 or more of the same character (the initial capture group plus 2 or more repetitions). To match 2 or more, use + instead of {2,}, and replace with \1 followed by a space so that the repeated character appears once in the output:

import re
texto = "asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww -- qqqqqq aaaaa"
rx = re.compile(r'([!-/:-@[-`{-~])\1+')
texto = rx.sub(r'\1 ', texto)
print(texto)

Output:

asdasdasd - adasdasd asda -  asda wadsda +- + wwww -  qqqqqq aaaaa
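
If the hand-written ranges are hard to read, one possible variant is to build the character class from string.punctuation instead. This is only a sketch: it assumes "punctuation" means the 32 ASCII characters in string.punctuation (the same set the ranges above cover), and the names PUNCT_RUN and collapse_punct_runs are made up for illustration.

```python
import re
import string

# Assumption: "punctuation" means the ASCII characters in string.punctuation,
# the same set the ranges [!-/:-@[-`{-~] cover.
# re.escape() backslash-escapes characters such as ], \, ^ and - so they are
# taken literally inside the character class.
PUNCT_RUN = re.compile(r'([' + re.escape(string.punctuation) + r'])\1+')


def collapse_punct_runs(text):
    """Replace each run of 2+ identical punctuation characters
    with a single copy of that character followed by a space."""
    return PUNCT_RUN.sub(r'\1 ', text)


texto = "asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww -- qqqqqq aaaaa"
print(collapse_punct_runs(texto))
```

The backreference \1 only matches the same character that the group captured, so a mixed sequence such as +-+ is left untouched, while !! or ---- is collapsed.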
Nick answered Oct 30 '25 04:10



