I have a list of words:
string[] BAD_WORDS = { "xxx", "o2o" }; // My actual list is much bigger, about 100 words
and I have some text (usually short, max 250 words) from which I need to remove all the BAD_WORDS.
I have tried this:
foreach (var word in BAD_WORDS)
{
    string w = string.Format(" {0} ", word);
    if (input.Contains(w))
    {
        while (input.Contains(w))
        {
            input = input.Replace(w, " ");
        }
    }
}
But if the text starts or ends with a bad word, that word is not removed. I added the spaces so that partial words are not matched; for example, "oxxx" should not be removed, since it is not an exact match for any of the BAD_WORDS.
Can anyone give me advice on this?
You can use a regular expression with word boundaries (\b):
string cleaned = Regex.Replace(input, "\\b" + string.Join("\\b|\\b", BAD_WORDS) + "\\b", "");
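One way to harden the one-liner above, offered as a sketch rather than a definitive version: escape each word with Regex.Escape so that any regex metacharacters in the list are matched literally, put all alternatives inside a single pair of \b anchors, and collapse the whitespace left behind by removed words. The class name BadWordFilter, the method name Clean, and the whitespace cleanup step are assumptions added for illustration; they are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for illustration only.
public static class BadWordFilter
{
    // Sample list from the question; the real list is about 100 words.
    private static readonly string[] BAD_WORDS = { "xxx", "o2o" };

    // Builds \b(?:xxx|o2o)\b : a whole-word match for any listed word.
    // The pattern is built once and reused for every call.
    private static readonly Regex BadWordPattern = new Regex(
        @"\b(?:" + string.Join("|", BAD_WORDS.Select(w => Regex.Escape(w))) + @")\b");

    public static string Clean(string input)
    {
        string withoutBadWords = BadWordPattern.Replace(input, "");

        // Removing a word leaves its surrounding spaces behind, so collapse
        // runs of whitespace into one space and trim both ends.
        return Regex.Replace(withoutBadWords, @"\s+", " ").Trim();
    }
}
```

For example, BadWordFilter.Clean("xxx hello o2o world oxxx") returns "hello world oxxx": the bad words at the start and in the middle are removed, while "oxxx" is kept because there is no word boundary inside it. Note that \b only matches between a word character and a non-word character, so a list entry that starts or ends with punctuation would need different anchors.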
This is a great task for LINQ and the Split method. Try this:
return string.Join(" ", input.Split(' ').Where(w => !BAD_WORDS.Contains(w)));
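A slightly fuller sketch of the Split-based answer above, under a few assumptions that are not in the original: a HashSet is used so each lookup does not scan the whole 100-word list, matching is made case-insensitive, and empty tokens from repeated spaces are dropped. The names BadWordSplitFilter and Clean are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, named for illustration only.
public static class BadWordSplitFilter
{
    // Assumption: case-insensitive matching is wanted.
    // Use StringComparer.Ordinal instead for exact-case matching.
    private static readonly HashSet<string> BadWords =
        new HashSet<string>(new[] { "xxx", "o2o" }, StringComparer.OrdinalIgnoreCase);

    public static string Clean(string input)
    {
        // RemoveEmptyEntries drops the empty tokens produced by repeated spaces.
        var kept = input
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => !BadWords.Contains(token));

        return string.Join(" ", kept);
    }
}
```

Because the text is split only on spaces, a token with punctuation attached, such as "xxx,", does not equal any list entry and is kept. If that matters, the regular expression approach handles it.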
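You could use the StartsWith and EndsWith methods to handle a bad word at the very start or end of the text, before running your existing loop (inside the same foreach, where w is " word "):
while (input.StartsWith(word + " "))
    input = input.Substring(word.Length + 1);
while (input.EndsWith(" " + word))
    input = input.Substring(0, input.Length - word.Length - 1);
while (input.Contains(w))
    input = input.Replace(w, " ");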
Hope this will fix your problem.