
How to create more accurate searching?

Tags:

c#

linq

I need a more accurate search function, in either jQuery or C#. If possible, I want the search to be as brilliant as Google :-)

So here is the C# code:

Brief explanation:
This searches for all users in the database that have complete information, excluding the currently logged-in user.

string[] ck = keyword.Split(new string[] { " ", ",", "." },
                            StringSplitOptions.RemoveEmptyEntries);

using (dbasecore db = ConfigAndResourceComponent.BaseCampContext())
{
    var results = (from u in db.users
                   join uinfo in db.userinfoes 
                        on u.UserID equals uinfo.UserID
                   where u.UserID != userid && 
                        (ck.Contains(u.LastName) || ck.Contains(u.FirstName) ||
                         ck.Contains(u.MiddleName) || ck.Contains(u.LoginID))
                   orderby u.LastName, u.FirstName, u.MiddleName ascending
                   select uinfo).Skip(skip).Take(take).ToList();

    return (from i in results select new UserInfo(i)).ToList();
}  

And the result:

[Image: screenshot of the search results, with one name encircled]

The encircled name should be at the top of the search results, since it matches more keywords.
Any ideas?

fiberOptics asked May 29 '13 12:05


3 Answers

There are several ways to achieve what you want:

1) Write your own ranking algorithm. That is, fetch the results using LINQ and then sort them with your own ranking function. This can be something simple, like splitting the request into words and counting how often those words appear in each result, or something complex, like using stemming to find different forms of the request terms, measuring the distance between terms, boosting some terms, etc. I would not recommend going this way, because LIKE queries are slow in SQL and you would be writing something that has already been written.

2) Use SQL Server Full-Text Search: http://msdn.microsoft.com/en-us/library/ms142524(v=sql.105).aspx. Although I'm not a fan of SQL Server Full-Text Search, it is a good and viable solution.

3) Use a third-party full-text search engine. There are several alternatives; Lucene (http://www.codeproject.com/Articles/29755/Introducing-Lucene-Net) is probably the most used in .NET. It gives you speed and flexibility, and you can index your data in various ways, but you are responsible for indexing it. There are also search servers built on top of Lucene, like Solr, which I love most, although it may be too much in your case.
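To make the "something simple" variant of option 1 concrete, here is a minimal sketch that splits the request into words and counts how many of them appear in each candidate. It is only an illustration under assumptions: the class name SearchRanking, the methods Score and Rank, and the use of plain full-name strings instead of database rows are all hypothetical, and the ranking runs in memory after the rows have been fetched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original question's code.
public static class SearchRanking
{
    // Same separators the question uses to split the keyword string.
    private static readonly char[] Separators = { ' ', ',', '.' };

    // Counts how many distinct keywords occur as whole words in the
    // given fields, ignoring case. Null or empty fields are skipped.
    public static int Score(string[] keywords, params string[] fields)
    {
        var fieldWords = new HashSet<string>(
            fields.Where(f => !string.IsNullOrEmpty(f))
                  .SelectMany(f => f.Split(Separators, StringSplitOptions.RemoveEmptyEntries)),
            StringComparer.OrdinalIgnoreCase);

        return keywords.Distinct(StringComparer.OrdinalIgnoreCase)
                       .Count(k => fieldWords.Contains(k));
    }

    // Keeps only names that match at least one keyword and puts the
    // names matching the most keywords first.
    public static List<string> Rank(string query, IEnumerable<string> fullNames)
    {
        string[] keywords = query.Split(Separators, StringSplitOptions.RemoveEmptyEntries);

        return fullNames
            .Select(name => new { Name = name, Score = Score(keywords, name) })
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score)
            .ThenBy(x => x.Name, StringComparer.OrdinalIgnoreCase)
            .Select(x => x.Name)
            .ToList();
    }
}
```

For example, SearchRanking.Rank("john smith", new[] { "Smith, Adam", "John Smith", "Mary Jones" }) puts "John Smith" ahead of "Smith, Adam" and drops "Mary Jones". Stemming, term distance, and boosting are deliberately left out; those are the parts a full-text engine already handles.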

Giedrius answered Oct 17 '22 15:10


For simplicity, I will use a single table with a user entity like this:

public class User
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string MiddleName { get; set; }
}

Here is a query (it works in EF) that calculates a match value for each user, selects only the users that matched at least one keyword, and orders the results by match value:

var keywords = new [] {"Sergey", "Berezovskiy"};

var users = from u in context.Users
            let match = (keywords.Contains(u.FirstName) ? 1 : 0) +
                        (keywords.Contains(u.LastName) ? 1 : 0) +
                        (keywords.Contains(u.MiddleName) ? 1 : 0)
            where match > 0
            orderby match descending, 
                    u.LastName, u.FirstName
            select u;

The range variable match takes values from 0 (if no field matched a keyword) to 3 (if all three fields matched). Users with a match of 0 are filtered out by the where clause.
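Since the question pages its results with Skip and Take, here is a small sketch of how the match-ordered query might be combined with paging. It runs against an in-memory list (LINQ to Objects) so it can be tried without a database; the names SampleUser, MatchPaging, and Search are made up for this illustration. Note that in memory the comparison is case-sensitive, whereas in SQL it depends on the column collation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the User entity shown above.
public sealed class SampleUser
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string MiddleName { get; set; }
}

public static class MatchPaging
{
    // Orders by number of matched fields first, then pages the result.
    // Paging must come after the ordering, otherwise a page could miss
    // the best matches.
    public static List<SampleUser> Search(
        IEnumerable<SampleUser> users, string[] keywords, int skip, int take)
    {
        var query = from u in users
                    let match = (keywords.Contains(u.FirstName) ? 1 : 0) +
                                (keywords.Contains(u.LastName) ? 1 : 0) +
                                (keywords.Contains(u.MiddleName) ? 1 : 0)
                    where match > 0
                    orderby match descending, u.LastName, u.FirstName
                    select u;

        return query.Skip(skip).Take(take).ToList();
    }
}
```

With the keywords "Sergey" and "Berezovskiy", a user named Sergey Berezovskiy (two matching fields) lands on the first page ahead of a user named Sergey Ivanov (one matching field).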

Sergey Berezovskiy answered Oct 17 '22 15:10


As brilliant as Google may be a far cry, but you can achieve something acceptable using a very simple technique. Here's the idea:

In your WHERE clause, instead of doing WHERE ck.Contains(u.LastName) || ck.Contains(u.FirstName), you can add an expression that assigns a value to each successful criterion (according to its relative weight) and add them to get a final score. For example:

WHERE (ck.Contains(u.LastName)? 1 : 0) + (ck.Contains(u.FirstName)? 2 : 0) + ...

I'm not sure whether LINQ supports the ternary operator, but if it doesn't, you can achieve the same result with a loop and manual scoring. The sum of all terms gives a higher score to the candidates that are a closer match. You can then sort by this score.
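The loop-based variant could look roughly like the sketch below. The weights (1 for the last name, 2 for the first name, 3 for the login ID) and the names WeightedSearch and Score are assumptions made up for this example; pick whatever weights reflect how important each field is to you. It needs C# 7 or later because of the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating weighted scoring with a plain loop.
public static class WeightedSearch
{
    // Adds up the weight of every field whose value is one of the keywords.
    // The weights are illustrative assumptions, not values from the question.
    public static int Score(ISet<string> keywords,
                            string lastName, string firstName, string loginId)
    {
        var weightedFields = new (string Value, int Weight)[]
        {
            (lastName, 1),
            (firstName, 2),
            (loginId, 3),
        };

        int score = 0;
        foreach (var (value, weight) in weightedFields)
        {
            if (value != null && keywords.Contains(value))
            {
                score += weight;
            }
        }
        return score;
    }
}
```

To use it, build the keyword set once, for example new HashSet<string>(ck, StringComparer.OrdinalIgnoreCase), compute the score for each fetched user, discard the users with a score of 0, and order the rest by score descending.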

dotNET answered Oct 17 '22 15:10