Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ string[] against multiple fields

Tags:

c#

linq

Lets say I have a table dataContext.Customer with the following fields

    FName    varchar
    LName    varchar
    Phone    varchar
    DOB      datetime
    Address  varchar

The table is filled with some sample data, lets say:

    John | Smith | 3051112222 | 01/01/1978 | Roosevelt Av 787
    Aron | Frank | 7871112222 | 01/01/1979 | Lambda Street 305
    Dick | Bush  | 9512221111 | 01/01/1980 | John Street 1
    John | Allen | 7872222222 | 01/01/1981 | Liberty Av 555

We also have a string array with an arbitrary number of elements, for example:

    search[0] = "1978"
    search[1] = "John"

I need a LINQ query that will compare each field of the table against each item in the string array incrementally using "contains" or "any" (meaning LIKE in SQL) and only return the rows that match all the given conditions in a record, based on the previous search[] example the LINQ query should return only record #1.

Another example can be:

    search[0] = "Bush"
    search[1] = "111"
    search[2] = "John"

and only record #3 should by returned. Finally for:

    search[0] = "John"

Records #1, #3 and #4 should be returned (I think the idea is clear)

There's a question on how to compare a string[] against field in: LINQ: Entity string field contains any of an array of strings

If the answer is a 50 lines C# routine I prefer to solve this directly in the database via stored procedure.

It will be awesome if there is some kind of "reflection" trick to iterate all the field on dataContext.Customers while doing the query (obviously the real table don't have 5 fields).

Performance is not a problem.

I'm pretty sure this cannot be done in a single LINQ line because of the logic needed for the multiple match, but it never hurt to ask, much less to learn anything new :)

UPDATE: Ok, here's a simple SQL code that will accomplish the task. Note that I have cut the amount of search variables to just 2 for clarity. In the real life scenario we can limit the amount of arguments to 10 search parameters. I deliberately did not use functions (well, except for CONVERT) to keep the SQL as simple as possible to see if there is any way to accomplish this in LINQ. Here's the SQL:

    declare @_SEARCH1 varchar(1000)
    select  @_SEARCH1 = 'John'
    declare @_SEARCH2 varchar(1000)
    select  @_SEARCH2 = '111'

    select  *
    from    CUSTOMER
    where
            FName + ' ' + LName + ' ' + Phone + ' ' + CONVERT(varchar, DOB, 101) + ' ' + Address like '%'+@_SEARCH1+'%'
    and     FName + ' ' + LName + ' ' + Phone + ' ' + CONVERT(varchar, DOB, 101) + ' ' + Address like '%'+@_SEARCH2+'%'

So the question is, is there a way to write a LINQ that will generate this simple SQL? (please note that the comparison is done in the database via 'LIKE', not in the application)

UPDATE 2: Although solutions like the one from Francisco will generate the "LIKE" statement it will fail doing the comparison. The other solutions that pulls all the data from the table to the webserver will do the match correctly, but is totally impractical.

Accepted answer to RUNE FS since is the cleanest solution and will work will any number of fields.

like image 937
Nick Avatar asked Aug 17 '11 02:08

Nick


4 Answers

Using the PredicateBuilder

void Main()
{
    var search = new string[] { "Romania","RO"};
    var query = from c in countries.AllAny(search)
        orderby c.name
        select c;
    query.Dump();
}

public static class QueryExtensions
{
    public static IQueryable<T> AllAny<T>(this IQueryable<T> query, string[] search)    
    {           
        var properties = typeof(T).GetProperties().Where(p => p.GetCustomAttributes(typeof(System.Data.Linq.Mapping.ColumnAttribute),true).Any()).Select(n=>n.Name);        
        var andPredicate = PredicateBuilder.True<T>();
        foreach ( var term in search )
        {
            var orPredicate = PredicateBuilder.False<T>();
            foreach (var property in properties )
                orPredicate = orPredicate.Or(CreateLike<T>(property,term));
            andPredicate = andPredicate.And(orPredicate);
        }
        return query.Where(andPredicate);
    }
    private static Expression<Func<T,bool>> CreateLike<T>( PropertyInfo prop, string value)
    {       
        var parameter = Expression.Parameter(typeof(T), "f");
        var propertyAccess = Expression.MakeMemberAccess(parameter, prop);            
        var toString = Expression.Call(propertyAccess, "ToString", null, null);
        var like = Expression.Call(toString, "Contains", null, Expression.Constant(value,typeof(string)));

        return Expression.Lambda<Func<T, bool>>(like, parameter);       
    }

    private static Expression<Func<T,bool>> CreateLike<T>( string propertyName, string value)
    {
        var prop = typeof(T).GetProperty(propertyName);     
        return CreateLike<T>(prop, value);
    }

}

// http://www.albahari.com/nutshell/predicatebuilder.aspx
public static class PredicateBuilder
{
  public static Expression<Func<T, bool>> True<T> ()  { return f => true;  }
  public static Expression<Func<T, bool>> False<T> () { return f => false; }

  public static Expression<Func<T, bool>> Or<T> (this Expression<Func<T, bool>> expr1,
                                                      Expression<Func<T, bool>> expr2)
  {
    var invokedExpr = Expression.Invoke (expr2, expr1.Parameters.Cast<Expression> ());
    return Expression.Lambda<Func<T, bool>>
          (Expression.OrElse (expr1.Body, invokedExpr), expr1.Parameters);
  }

  public static Expression<Func<T, bool>> And<T> (this Expression<Func<T, bool>> expr1,
                                                       Expression<Func<T, bool>> expr2)
  {
    var invokedExpr = Expression.Invoke (expr2, expr1.Parameters.Cast<Expression> ());
    return Expression.Lambda<Func<T, bool>>
          (Expression.AndAlso (expr1.Body, invokedExpr), expr1.Parameters);
  }
}

Update This code is a generic solution for the following query

from c in countries
where (c.name.ToString().Contains(search[0]) || c.name.ToString().Contains(search[1]))
    && (c.iso_code.ToString().Contains(search[0]) || c.iso_code.ToString().Contains(search[1]))
    /*&& ...*/
orderby c.name
select c

This code can be improved in many ways. For sample, for the string properties, there is no need to call ToString before Contains ( this will generate a convert(nvarchar)) and I really think someone who needs this will only want to look at the varchar, nvarchar columns.

like image 127
Adrian Iftode Avatar answered Nov 03 '22 18:11

Adrian Iftode


Assuming that '\t' would never be part of the data you could do the following. You can of course substitute with any other character. With that assumption you could do as below:

public static IEnumerable<T> Where<T>(this IEnumerable<T> sequence, 
                                      string[] criteria){

var properties = typeof(T).GetProperties()
                          .Where(p=>p.GetGetMethod() != null);
return from s in sequence
       let text = properties.Aggregate("",(acc,prop) => 
                                               acc + 
                                               "\t" + 
                                               prop.GetValue(s,null)
                                       )
       where criteria.All(c => text.Contains(c))
       select s;

}

EDIT I originally didn't include the usage since I found no collection in the original post but assuming the sequence is defined as IEnumerabl<Person> and can be accessed as a property called Persons on a variable db. the code would look similar to:

IEnumerable<Person> persons = db.Persons.Where(criteria);
like image 31
Rune FS Avatar answered Nov 03 '22 17:11

Rune FS


I don't think linq to sql can do this efficiently, but linq to objects might be able to help if you can stomach moving the entire table over the wire between your application code and the database exactly one time per search.

The first step is to get a DataReader that will read through all of the records in your table. It's important to use a datareader, because you probably don't want to fill up memory with an entire table (and then again, maybe you do -- see my note about caching near the end).

Once you have that, you need to convert it to an IEnumerable for use with linq-to-objects. You can do that with a simple two-line utility method, like so:

IEnumerable<IDataRecord> EnumerableFromDataReader(IDataReader reader)
{
    while (reader.Read())
        yield return reader;
}

The actual code I normally use is a little more complicated, but what I do might not fit in well with how the rest of your project is structured so I'll leave it at this short method for now.

Once we have the Enumerable, we take advantage of the composable nature of linq to get some (relatively) simple (or at least brief) code like this:

IEnumerable<IDataRecord> SearchMyTable(string[] filters)
{
    var results = EnumerableFromDataReader(GetMyTable());

    foreach (string filter in filters)
    {
       results = results.Where(r => String.Join("",r.GetValues().Cast<string>().ToArray()).Contains(filter));
    }

    return results;
}

It's not a single linq query, but it's not exactly 50 lines of code either.

Ignoring network latency, the performance of this code is actually quite good. If you want to play with some caching for all or part of the table after the columns are joined, the performance is amazing.

Update: There is (at least) one flaw in this code. For each filter, I re-convert all the rows that survive to that filter to a string... even if I've already done so for a previous filter. The way to fix this is to first do a projection to pair the rows with the string version. But as it's after 11pm my time, I'll leave the code as it stands for now. The good news is the the final fixed code should be the same length: just add a .Select() call to the first and last lines and change up the middle of the foreach loop a bit.

like image 1
Joel Coehoorn Avatar answered Nov 03 '22 18:11

Joel Coehoorn


I don't have a C# compiler at hand right now but I have this idea:

Declare a lambda Expression like this:

public Expression<Func<Customer, bool>> GetFilterFromString(string input)
{
    return p=> p.FName.Contains(input) ||
               p.LName.Contains(input) ||
               p.Phone.Contains(input) ||
               p.DOB.ToString().Contains(input) ||
               p.Address.Contains(input) ||    
}

Implementation may vary depending on your needs (like concatenating all fields, like you did with your SQL query).

Then in your main query function:

IQueryable<Customer> customers = dataContext.Customers;
foreach(string inputSearch in search)
{
    customers = customers.Where(GetFilterFromString(inputSearch));
}
IEnumerable<Customer> results = customers.AsEnumerable();

I think the main advantage of this approach is that you have to declare GetFilterFromString once. Hope it is what you're looking for.

Edit:

Ok, so I read the SQL statement you were looking for (kind of late... but anyway). I think it's easy to adapt my solution. We will have to tweak the lambda expression a little:

public Expression<Func<Customer, bool>> GetFilterFromString(string input)
    {
        return p => (p.FName + " " +
                   p.LName + " " +                  
                   p.Phone + " " +
                   p.DOB.ToString() + " " +
                   p.Address)
                   .Contains(input) 
    }
like image 1
Francisco Avatar answered Nov 03 '22 18:11

Francisco