Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concerns about SQL Server 2008 Full Text Search

I have built a T-SQL query like this:

DECLARE @search nvarchar(1000) = 'FORMSOF(INFLECTIONAL,hills) AND FORMSOF(INFLECTIONAL,print) AND FORMSOF(INFLECTIONAL,emergency)'

SELECT * FROM Tickets
WHERE ID IN (
                -- unioned subqueries using CONTAINSTABLE
                            ...
            )

The GUI for this search will be an aspx page with a single textbox where the user can search.

I plan to somehow construct the search term to be like the example above (@search).

I have some concerns, though:

  • Is the example search term above the best or only way to include the inflections of all words in the search?
  • Should I separate the words and construct the search term in C# or T-SQL. I tend to lean toward C# for decisions/looping/construction, but I want your opinion.
  • I hate building SQL dynamically because of the risk of injection. How can I guard against this?
  • Should I use FREETEXTTABLE instead? Is there a way to make FREETEXT look for ALL words instead of ANY?
  • In general, how else would you do this?
like image 986
Ronnie Overby Avatar asked Aug 31 '09 17:08

Ronnie Overby


People also ask

Does SQL Server support full text search?

Full-Text Search in SQL Server and Azure SQL Database lets users and applications run full-text queries against character-based data in SQL Server tables.

How do I optimize full text search in SQL Server?

Tune the performance of full-text indexesMake sure that the base table has a clustered index. Use an integer data type for the first column of the clustered index. Avoid using GUIDs in the first column of the clustered index. A multi-range population on a clustered index can produce the highest population speed.

What is the use of full text search in SQL Server?

Full-text search refers to the functionality in SQL Server that supports full-text queries against character-based data. These types of queries can include words and phrases as well as multiple forms of a word or phrase.

What is advantage of fulltext over like for performing text search in MySQL?

Using the LIKE operator gives you 100% precision with no concessions for recall. A full text search facility gives you a lot of flexibility to tune down the precision for better recall. Most full text search implementations use an "inverted index".


2 Answers

I recently used Full-Text Search, so I'll try to answer some of your questions.

• "I hate building sql dynamically because of the risk of injection. How can I guard against this?"

I used a sanitize method like this:

static string SanitizeInput(string searchPhrase)
    {
        if (searchPhrase.Length > 200)
            searchPhrase = searchPhrase.Substring(0, 200);

        searchPhrase = searchPhrase.Replace(";", " ");
        searchPhrase = searchPhrase.Replace("'", " ");
        searchPhrase = searchPhrase.Replace("--", " ");
        searchPhrase = searchPhrase.Replace("/*", " ");
        searchPhrase = searchPhrase.Replace("*/", " ");
        searchPhrase = searchPhrase.Replace("xp_", " ");

        return searchPhrase;
    }

• Should I use FREETEXTTABLE instead? Is there a way to make FREETEXT look for ALL words instead of ANY?

I did use FREETEXTTABLE, but I needed any of the words. As much as I've read about it (and I've read quite a bit), you have to use CONTAINSTABLE to search for ALL words, or different combinations. FREETEXTTABLE seems to be the lighter solution, but not the one to pick when you want deeper customizations.

like image 176
Dan Dumitru Avatar answered Oct 06 '22 00:10

Dan Dumitru


Dan, I like your SanitizeInput method. I refactored it to make it more compact and enhance performance a little.

 static string SanitizeInput(string searchPhrase, int maxLength)
        {
            Regex r = new Regex(@";|'|--|xp_|/\*|\*/", RegexOptions.Compiled);
            return r.Replace(searchPhrase.Substring(0, searchPhrase.Length > maxLength ? maxLength : searchPhrase.Length), " ");
        }

        static string SanitizeInput(string searchPhrase)
        {
            const int MAX_SEARCH_PHRASE_LENGTH = 200;
            return SanitizeInput(searchPhrase, MAX_SEARCH_PHRASE_LENGTH);
        }

I agree that FreeTextTable is too lightweight of a solution.

like image 38
Anthony Gatlin Avatar answered Oct 06 '22 00:10

Anthony Gatlin