Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Make SQL Server index small numbers

We're using SQL Server 2005 in a project. The users of the system have the ability to search some objects by using 'keywords'. The way we implement this is by creating a full-text catalog for the significant columns in each table that may contain these 'keywords' and then using CONTAINS to search for the keywords the user inputs in the search box in that index.

So, for example, let say you have the Movie object, and you want to let the user search for keywords in the title and body of the article, then we'd index both the Title and Plot column, and then do something like:

SELECT * FROM Movies WHERE CONTAINS(Title, keywords) OR CONTAINS(Plot, keywords)

(It's actually a bit more advanced than that, but nothing terribly complex)

Some users are adding numbers to their search, so for example they want to find 'Terminator 2'. The problem here is that, as far as I know, by default SQL Server won't index short words, thus doing a search like this:

SELECT * FROM Movies WHERE CONTAINS(Title, '"Terminator 2"')

is actually equivalent to doing this:

SELECT * FROM Movies WHERE CONTAINS(Title, '"Terminator"') <-- notice the missing '2'

and we are getting a plethora of spurious results.

Is there a way to force SQL Server to index small words? Preferably, I'd rather index only numbers like 1, 2, 21, etc. I don't know where to define the indexing criteria, or even if it's possible to be as specific as that.


Well, I did that, removed the "noise-words" from the list, and now the behaviour is a bit different, but still not what you'd expect.

A search won't for "Terminator 2" (I'm just making this up, my employer might not be really happy if I disclose what we are doing... anyway, the terms are a bit different but the principle the same), I don't get anything, but I know there are objects containing the two words.

Maybe I'm doing something wrong? I removed all numbers 1 ... 9 from my noise configuration for ENG, ENU and NEU (neutral), regenerated the indexes, and tried the search.

like image 214
dguaraglia Avatar asked Sep 16 '08 12:09

dguaraglia


People also ask

Why is it not recommended to create indexes on small tables?

Indexing small tables may not be optimal because it can take the query optimizer longer to traverse the index searching for data than to perform a simple table scan. Therefore, indexes on small tables might never be used, but must still be maintained as data in the table changes.

Can we truncate index?

"TRUNCATE TABLE removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain.

How do I SELECT the smallest number in SQL?

The MIN() function returns the smallest value of the selected column. The MAX() function returns the largest value of the selected column.

Can we alter index in SQL Server?

Using SQL Server Management StudioRight-click the index that you want to modify and then click Properties. In the Index Properties dialog box, make the desired changes. For example, you can add or remove a column from the index key, or change the setting of an index option.


1 Answers

These "small words" are considered "noise words" by the full text index. You can customize the list of noise words. This blog post provides more details. You need to repopulate your full text index when you change the noise words file.

like image 129
Darren Gosbell Avatar answered Nov 15 '22 06:11

Darren Gosbell