How can I compare a string to a "filter" list in linq?

Tags: c#, linq

I'm trying to filter a collection of strings by a "filter" list, i.e. a list of bad words. If a string contains a word from the list, I don't want it.

Here's what I've got so far; the bad word here is "frakk":

string[] filter = { "bad", "words", "frakk" };

string[] foo = 
{ 
    "this is a lol string that is allowed", 
    "this is another lol frakk string that is not allowed!"
};

var items = from item in foo 
            where (item.IndexOf( (from f in filter select f).ToString() ) == 0)
            select item;

But this isn't working. Why?

Jason94 asked Jul 26 '13 20:07





2 Answers

You can use Any + Contains:

var items = foo.Where(s => !filter.Any(w => s.Contains(w)));
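As a minimal runnable sketch of the one-liner above, assuming C# 9+ top-level statements and the `filter` and `foo` arrays from the question: it also prints what the question's `(from f in filter select f).ToString()` evaluates to, which seems to be the reason the original query matched nothing.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "this is a lol string that is allowed",
    "this is another lol frakk string that is not allowed!"
};

// Keep only the strings that contain none of the filter words (substring match).
var items = foo.Where(s => !filter.Any(w => s.Contains(w))).ToArray();

foreach (var item in items)
    Console.WriteLine(item);

// ToString() on a LINQ query object returns the query's type name,
// not any of the filter words, so IndexOf never finds it in a sentence.
string queryText = (from f in filter select f).ToString();
Console.WriteLine(queryText);
```

Note also that `IndexOf(...) == 0` in the question only matches at the very start of the string; `>= 0` (or `Contains`) is what tests for a match anywhere.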

If you want to compare case-insensitively:

var items = foo.Where(s => !filter.Any(w => s.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0));
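A small sketch of the case-insensitive variant, with made-up sample sentences. The second query uses the `Contains(string, StringComparison)` overload, which is an assumption about your target framework: it is available on .NET Core 2.1+ and .NET 5+, but not on the .NET Framework, where the `IndexOf` form is the one to use.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "this one is fine",
    "this one says FRAKK in capitals"
};

// Works on every framework version: IndexOf with an explicit comparison.
var viaIndexOf = foo
    .Where(s => !filter.Any(w => s.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
    .ToArray();

// Newer frameworks only: Contains with an explicit comparison.
var viaContains = foo
    .Where(s => !filter.Any(w => s.Contains(w, StringComparison.OrdinalIgnoreCase)))
    .ToArray();

foreach (var item in viaIndexOf)
    Console.WriteLine(item);
```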

Update: If you want to exclude sentences where at least one word is in the filter list, you can use String.Split() and Enumerable.Intersect:

var items = foo.Where(sentence => !sentence.Split().Intersect(filter).Any());

Enumerable.Intersect is very efficient since it uses a set under the hood. It's more efficient to put the longer sequence first, because the second sequence is the one that gets loaded into the set. Due to LINQ's deferred execution, Any() stops at the first matching word.

(Note that Split() with no arguments also splits on other white-space characters such as tab or newline.)
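The two approaches do not behave the same way, which the following sketch tries to illustrate with made-up sentences: the substring check also rejects "badge" because it contains "bad", while the whole-word check misses "frakk!" because `Split()` leaves the punctuation attached.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "he earned a badge",
    "no frakk allowed",
    "oh frakk!"
};

// Substring match: any occurrence of a filter word anywhere in the sentence.
var bySubstring = foo.Where(s => !filter.Any(w => s.Contains(w))).ToArray();

// Whole-word match: only exact, white-space-delimited tokens count.
var byWholeWord = foo.Where(s => !s.Split().Intersect(filter).Any()).ToArray();

Console.WriteLine("Substring check keeps:  " + string.Join(" | ", bySubstring));
Console.WriteLine("Whole-word check keeps: " + string.Join(" | ", byWholeWord));
```

Which behaviour is "right" depends on whether partial matches inside longer words should count.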

Tim Schmelter answered Oct 10 '22 01:10



The first problem you need to solve is breaking up the sentence into a series of words. The simplest way to do this is to split on spaces:

string[] words = sentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);

From there you can use a simple LINQ expression to find the profanities:

var badWords = words.Where(x => filter.Contains(x));
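Put together, the two lines above might look like this; a minimal sketch that assumes the `filter` array from the question and uses its second sample string as `sentence`.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string sentence = "this is another lol frakk string that is not allowed!";

// Break the sentence into words on single spaces, dropping empty entries.
string[] words = sentence.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

// Collect the words that appear in the filter list (exact, case-sensitive match).
var badWords = words.Where(x => filter.Contains(x)).ToArray();

Console.WriteLine(string.Join(", ", badWords));

// The sentence is acceptable only if no bad word was found.
bool isAllowed = badWords.Length == 0;
Console.WriteLine(isAllowed);
```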

However, this is a fairly primitive solution. It won't handle a number of more complex cases that you likely need to think about:

  • There are many characters which qualify as a space. My solution only uses ' '.
  • The split doesn't handle punctuation, so dog! won't be viewed as dog. It's probably much better to break up words on legal characters.
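One possible way to address both points is to extract runs of letters instead of splitting on spaces. The sketch below is only an illustration: `Tokenize` and `IsAllowed` are hypothetical helper names, the `\p{L}+` pattern (one or more Unicode letters) is an assumption about what counts as a "legal character", and the case-insensitive `HashSet` is a design choice rather than something the answer prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Case-insensitive lookup set built from the question's filter list.
var filter = new HashSet<string>(
    new[] { "bad", "words", "frakk" },
    StringComparer.OrdinalIgnoreCase);

// Hypothetical helper: treats every run of letters as one word,
// so white space of any kind and punctuation both act as separators.
IEnumerable<string> Tokenize(string sentence) =>
    Regex.Matches(sentence, @"\p{L}+").Cast<Match>().Select(m => m.Value);

// Hypothetical helper: a sentence is allowed if none of its words is in the filter.
bool IsAllowed(string sentence) => !Tokenize(sentence).Any(w => filter.Contains(w));

string[] foo =
{
    "this is a lol string that is allowed",
    "Frakk! that is not allowed",
    "a badge is fine"
};

var items = foo.Where(IsAllowed).ToArray();

foreach (var item in items)
    Console.WriteLine(item);
```

Digits and apostrophes are treated as separators here, so a word like "don't" becomes two tokens; adjust the pattern if that matters for your word list.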
JaredPar answered Oct 10 '22 01:10
