I'm really bad with Regex but I want to remove all these .,;:'"$#@!?/*&^-+ out of a string
string x = "This is a test string, with lots of: punctuations; in it?!.";
How can I do that ?
One of the easiest ways to remove punctuation from a string in Python is to use the str.translate() method. translate() takes a translation table, which we can build with the str.maketrans() method.
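A minimal sketch of the translate() approach, assuming that "punctuation" means the ASCII set in Python's string.punctuation. The function name strip_punctuation is just an illustrative choice:

```python
import string


def strip_punctuation(text: str) -> str:
    # The third argument of str.maketrans lists the characters to delete.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)


sample = "This is a test string, with lots of: punctuations; in it?!."
print(strip_punctuation(sample))
# This is a test string with lots of punctuations in it
```

Because string.punctuation is ASCII only, characters such as curly quotes would pass through unchanged.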
We can use the JavaScript string replace method with a regex that matches the patterns in a string that we want to replace. So we can use it to remove punctuation by matching the punctuation and replacing them all with empty strings.
To remove punctuation with Python Pandas, we can use the str.replace method of a DataFrame's string column. We call replace with a regex that matches all punctuation characters and replaces them with empty strings (recent pandas versions need regex=True for the pattern to be treated as a regex). replace returns a new column, and we assign that back to df['text'].
Some punctuation has special meaning in RegEx. It can get confusing if you are searching for things like question marks, periods, and parentheses. For example, a period means “match any character.” The easiest way to get around this is to “escape” the character by putting a backslash in front of it.
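To illustrate the escaping point, here is a small Python sketch. The helper remove_literal is hypothetical; it uses re.escape to backslash-escape every character, so each one is matched literally inside a character class:

```python
import re

# Unescaped, "." matches any character, so the whole string is removed.
print(repr(re.sub(r".", "", "a.b.c")))   # ''
# Escaped with a backslash, only literal periods are removed.
print(repr(re.sub(r"\.", "", "a.b.c")))  # 'abc'


def remove_literal(text: str, chars: str) -> str:
    # chars must be non-empty; re.escape handles ., ?, (, ), ^, - and so on.
    pattern = "[" + re.escape(chars) + "]"
    return re.sub(pattern, "", text)


print(remove_literal("Is 3.5 > 2? (yes.)", ".?()"))
# Is 35 > 2 yes
```

This works when you have an explicit list of characters to strip, such as the set in the question, and do not want to escape each one by hand.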
First, please read here for information on regular expressions. It's worth learning.
You can use this:
string result = Regex.Replace("This is a test string, with lots of: punctuations; in it?!.", @"[^\w\s]", "");
Which means:
[   #Character block start.
^   #Not these characters (letters, numbers).
\w  #Word characters.
\s  #Space characters.
]   #Character block end.
In the end it reads "replace any character that is not a word character or a space character with nothing."
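The same pattern carries over to other regex flavors. As a rough sketch in Python rather than the C# used above (the function name strip_non_word is made up for illustration):

```python
import re


def strip_non_word(text: str) -> str:
    # Remove every character that is neither a word character nor whitespace.
    return re.sub(r"[^\w\s]", "", text)


sample = "This is a test string, with lots of: punctuations; in it?!."
print(strip_non_word(sample))
# This is a test string with lots of punctuations in it
```

Note that \w includes the underscore, so a string like "snake_case!" comes out as "snake_case". If underscores should go too, a pattern such as [^\w\s]|_ is one option.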