Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Azure CONTAINS not returning all results

We added a free text search on the following table:

| 1 | kayer-meyar | 
| 2 | KA-ME |

But,

select * 
from Names 
where CONTAINS(name, '"ME*"')

returns only:

| 1 | kayer-meyar |

While,

select * 
from Names 
where CONTAINS(name, '"KA*"')

returns both:

| 1 | kayer-meyar |
| 2 | KA-ME |

when we run:

select * 
from sys.dm_fts_parser('"KA-ME"', 1033, NULL, 0)

returns:

ka-me
ka
me
like image 803
Guy Korland Avatar asked Jun 28 '16 11:06

Guy Korland


2 Answers

After searching and tuning your problem i have found two major fault in full-text searching:

  1. The hyphen might be treated as a word break. It return only | 1 | kayer-meyar | when i use '"ME*"'. it doesn't return | 2 | KA-ME |. The problem is because your condition only allow word start with (not end with or in a middle) ME + at least one character . You can say, "then how come it return | 1 | kayer-meyar | as string me is in the middle of this word ?". Well that it is because fulltext serach does not consider it as a silgle word, it consider it as two seperate word(something like kayer meyar ) thus it fullfill the requrement(me*). Again in the case of KA-ME it recognize as KA ME rather than a single word and it also fail the condition(though it star with ME but there is no extra character after that)
  2. Have you tried rebuilding your full-text index? .

Now the SOLUTION is: I have Turn off the Stop List for Full Text Search Query Use this query for this(my table name is MyTable):

ALTER FULLTEXT INDEX ON MyTable SET STOPLIST = OFF

then run your query. this time you will get your desire result. enter image description here

AND HERE IS MY FULL QUERY:

--CREATE TABLE MyTable
--(
--Id INT IDENTITY(1,1),
--Name varchar(max)  Not Null
--)

---- To see if FULLTEXT installed or not
--SELECT SERVERPROPERTY('IsFullTextInstalled')

---- http://stackoverflow.com/questions/2306825/why-cant-i-create-this-sql-server-full-text-index
---- https://technet.microsoft.com/en-us/library/ms187317.aspx
---- http://stackoverflow.com/questions/2306825/why-cant-i-create-this-sql-server-full-text-index
---- http://stackoverflow.com/questions/2315577/sql-server-2008-full-text-search-on-a-table-with-a-composite-primary-key

--CREATE UNIQUE INDEX ui_MyTable ON MyTable(Id); 
--select name from sysindexes where object_id('MyTable') = id;

--CREATE FULLTEXT CATALOG ft AS DEFAULT; 

--CREATE FULLTEXT INDEX ON MyTable(Name)   
--   KEY INDEX ui_MyTable  
--   WITH STOPLIST = SYSTEM;  
--GO  

--INSERT INTO MyTable(Name) VALUES('kayer-meyar'),('KA-ME');


ALTER FULLTEXT INDEX ON MyTable SET STOPLIST = OFF

select * 
from MyTable 
where CONTAINS(Name, '"ME*"')

select *
from MyTable 
where CONTAINS(Name, '"KA*"')
like image 144
RU Ahmed Avatar answered Sep 22 '22 08:09

RU Ahmed


The behavior you described is a consequence of the usage of system stopwords list is correct. This is an expected behavior. "Me" is a stop word, which exists in the system stopword list. System stopwords list used by default by data indexing process.

You can check yourself with this script:

select * from sys.dm_fts_parser('"KA-ME"', 1033, 0, 0)

The third parameter here is stopword list identifier. When you pass NULL, stopwords are not identified on parsing, and you see "ME" of type "Exact Match". When you pass 0 as the third parameter, system stopwords list used, and "ME" will be of type "Noise Word". This means that SQL Server will not save it into the FTS index for searching.

As Raihan mentioned, you can turn off system stopwords list, but as for me, turning off stopwords completely is a too big hammer, especially for Azure SQL Database, because you should pay for additional space (FTS indexes are stored in the same database in the internal tables). Creating a new (smaller) stopword list and use it for FTS may be a better solution.

like image 41
Denis Reznik Avatar answered Sep 21 '22 08:09

Denis Reznik