I'm trying to filter a collection of strings by a "filter" list, i.e. a list of bad words. If a string contains a word from the list, I don't want it.
This is what I have so far; the bad word here is "frakk":
string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
"this is a lol string that is allowed",
"this is another lol frakk string that is not allowed!"
};
var items = from item in foo
where (item.IndexOf( (from f in filter select f).ToString() ) == 0)
select item;
But this isn't working. Why?
You can use Any + Contains:
var items = foo.Where(s => !filter.Any(w => s.Contains(w)));
If you want to compare case-insensitively:
var items = foo.Where(s => !filter.Any(w => s.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0));
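As far as I can tell, the query in the question fails for two reasons: (from f in filter select f).ToString() turns the query object itself into a string (its type name) rather than into a filter word, and IndexOf(...) == 0 would only match at the very start of the string anyway. Here is a minimal sketch of the Any + Contains approach as a complete program, reusing the question's data. The variable name allowed is just for illustration.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "this is a lol string that is allowed",
    "this is another lol frakk string that is not allowed!"
};

// Keep a string only if none of the filter words occurs in it as a substring.
string[] allowed = foo
    .Where(s => !filter.Any(w => s.Contains(w)))
    .ToArray();

foreach (string s in allowed)
    Console.WriteLine(s);
// prints: this is a lol string that is allowed
```

Keep in mind that this is a substring test, so a harmless word such as "badge" would also be rejected because it contains "bad".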
Update: If you want to exclude sentences where at least one word is in the filter list, you can use String.Split() and Enumerable.Intersect:
var items = foo.Where(sentence => !sentence.Split().Intersect(filter).Any());
Enumerable.Intersect is very efficient since it uses a Set under the hood. It's more efficient to put the long sequence first. Due to LINQ's deferred execution, it stops on the first matching word.
(Note that the "empty" Split also splits on other white-space characters like tab or newline.)
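To show how the Split + Intersect version differs from the substring check, here is a small sketch. The three sample sentences are made up for illustration and are not from the question.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "a badge is fine",          // "bad" appears only as part of another word
    "tab\tseparated frakk",     // "frakk" is a whole word, separated by a tab and a space
    "frakk! with punctuation"   // the token here is "frakk!", not "frakk"
};

// Split() without arguments splits on any white-space character;
// a sentence is kept when none of its tokens equals a filter word.
string[] kept = foo
    .Where(sentence => !sentence.Split().Intersect(filter).Any())
    .ToArray();

foreach (string s in kept)
    Console.WriteLine(s);
```

Only the second sentence is dropped. The first is kept because whole words are compared, and the third slips through because the attached punctuation makes the token differ from the filter word.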
The first problem you need to solve is breaking up the sentence into a series of words. The simplest way to do this is based on spaces:
string[] words = sentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
From there you can use a simple LINQ expression to find the profanities:
var badWords = words.Where(x => filter.Contains(x));
However, this is a bit of a primitive solution. It won't handle a number of complex cases that you likely need to think about. The split above only uses ' ' as a separator, and it doesn't handle punctuation, so dog! won't be viewed as dog. It's probably much better to break up words on legal characters.
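One possible way to break up words on legal characters is to split on everything that is not a letter. This is only a sketch: treating "legal" as "any Unicode letter" (\p{L}) is my assumption, and the sample sentence is made up.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string[] filter = { "bad", "words", "frakk" };
string sentence = "what the Frakk! that dog, again?";

// Split on runs of non-letter characters, so punctuation and any kind of
// white space act as separators; drop the empty entries that Regex.Split
// can produce at the start or end of the input.
string[] words = Regex.Split(sentence, @"[^\p{L}]+")
    .Where(w => w.Length > 0)
    .ToArray();

// Compare whole words against the filter, ignoring case.
string[] badWords = words
    .Where(w => filter.Contains(w, StringComparer.OrdinalIgnoreCase))
    .ToArray();

foreach (string w in badWords)
    Console.WriteLine(w);
// prints: Frakk
```

Note that this pattern also splits on apostrophes and digits, so "don't" becomes "don" and "t". Whether that is acceptable depends on the word list.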