Regex to remove characters and supplied words

Tags: c#, regex

I need a regex that keeps only alphanumeric characters and hyphens, and that also removes certain whole words.

Example:

Input string: this-is-johny-bravo's-grand-dad

Result string: johny-bravos-dad

Words/characters to replace by an empty string: this,is,',grand

Here is what I have so far:

var input = "this-is-johny-bravo's-grand-dad";
var regex = new Regex(@"([^a-z0-9\-][\b(this|is|grand)\b]?)");
var result = regex.Replace(input, "");

The result no longer contains the apostrophe, but unfortunately it still includes the rejected whole words.

8 revs, 8 users 48% asked Dec 10 '25 23:12


2 Answers

You need to take the word list out of the character class and combine it with the negated character class using alternation:

new Regex(@"\b(this|is|grand)\b-?|[^a-z0-9-]");
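
As a quick check, here is a minimal sketch that wraps the pattern above in a helper. The names WordStripper and Clean are made up for illustration; only Regex.Replace comes from .NET.

```csharp
using System.Text.RegularExpressions;

public static class WordStripper
{
    // Either a rejected whole word (plus one optional trailing dash),
    // or any single character that is not a-z, 0-9 or a dash.
    private static readonly Regex Unwanted =
        new Regex(@"\b(this|is|grand)\b-?|[^a-z0-9-]");

    // Removes every match by replacing it with the empty string.
    public static string Clean(string input)
    {
        return Unwanted.Replace(input, "");
    }
}
```

With the question's input, WordStripper.Clean("this-is-johny-bravo's-grand-dad") should return johny-bravos-dad. The pattern is case-sensitive as written, so uppercase letters would be stripped as well unless the input is lowercased first.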

Rohit Jain answered Dec 12 '25 12:12


Your expression is too complicated. Try:

\b(this|is|grand|')\b-?

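Also, and this is the root cause of your problem: character classes are not for alternation. This [\b(this|is|grand)\b] is not a list of words; it is equivalent to the single-character set [()adghinrst|], plus a backspace, because \b inside a character class means backspace rather than word boundary.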
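
To see this in practice, here is a small sketch. CharClassDemo, ClassMatches and Original are hypothetical names; the second pattern is the one from the question, unchanged.

```csharp
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // True only if s is exactly one character from the bracketed set.
    public static bool ClassMatches(string s)
    {
        return Regex.IsMatch(s, @"^[\b(this|is|grand)\b]$");
    }

    // The question's pattern: one disallowed character, optionally
    // followed by ONE character from the set (not by a whole word).
    public static string Original(string input)
    {
        return Regex.Replace(input, @"([^a-z0-9\-][\b(this|is|grand)\b]?)", "");
    }
}
```

ClassMatches("s") and ClassMatches("|") are true, while ClassMatches("this") is false because the class only ever consumes one character. By my reading, Original("this-is-johny-bravo's-grand-dad") gives this-is-johny-bravo-grand-dad: the apostrophe is removed together with the s that follows it, and the whole words are left alone.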

Thinking about it, you probably want this:

(\b(this|is|grand)\b|[^a-z0-9-])-?

Break-down:

(                          # group 1
    \b(this|is|grand)\b    #   any of these words
    |                      #   or 
    [^a-z0-9-]             #   any character except one of these
)                          # end group 1
-?                         # optional dash at the end
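
The commented form above can be used more or less as-is, assuming RegexOptions.IgnorePatternWhitespace is passed so that whitespace and # comments in the pattern are ignored. A rough sketch (GroupedPatternDemo and Clean are illustrative names only):

```csharp
using System.Text.RegularExpressions;

public static class GroupedPatternDemo
{
    // Same pattern as the break-down, kept in its commented layout.
    private static readonly Regex Unwanted = new Regex(@"
        (                          # group 1
            \b(this|is|grand)\b    #   any of these words
            |                      #   or
            [^a-z0-9-]             #   any character except one of these
        )                          # end group 1
        -?                         # optional dash at the end
        ", RegexOptions.IgnorePatternWhitespace);

    // Removes every match by replacing it with the empty string.
    public static string Clean(string input)
    {
        return Unwanted.Replace(input, "");
    }
}
```

For the question's input this should return johny-bravos-dad. One thing to be aware of: because the optional dash follows the whole group, a dash right after a removed character is removed too, so Clean("a'-b") gives ab rather than a-b.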

Tomalak answered Dec 12 '25 11:12


