I don't have much experience with RegEx so I am using many chained String.Replace() calls to remove unwanted characters -- is there a RegEx I can write to streamline this?
string messyText = GetText();
string cleanText = messyText.Trim()
.ToUpper()
.Replace(",", "")
.Replace(":", "")
.Replace(".", "")
.Replace(";", "")
.Replace("/", "")
.Replace("\\", "")
.Replace("\n", "")
.Replace("\t", "")
.Replace("\r", "")
.Replace(Environment.NewLine, "")
.Replace(" ", "");
Thanks
Try this regex:
Regex regex = new Regex(@"[\s,:.;/\\]+");
string cleanText = regex.Replace(messyText, "").ToUpper();
\s
is a character class equivalent to [ \t\r\n]
.
If you just want to preserve alphanumeric characters, instead of adding every non-alphanumeric character in existence to the character class, you could do this:
Regex regex = new Regex(@"[\W_]+");
string cleanText = regex.Replace(messyText, "").ToUpper();
Where \W
is any non-word character (not [^a-zA-Z0-9_]
).
Character classes to the rescue!
string messyText = GetText();
string cleanText = Regex.Replace(messyText.Trim().ToUpper(), @"[,:.;/\\\n\t\r ]+", "")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With