
Strange behaviour with Fulltext search in SQL Server

I have MyTable with a Column Message NVARCHAR(MAX).

Record with ID 1 contains the Message '0123456789333444 Test'

When I run the following query

DECLARE @Keyword NVARCHAR(100)

SET @Keyword = '0123456789000001*'

SELECT *
FROM MyTable
WHERE CONTAINS(Message, @Keyword) 

Record ID 1 shows up in the results. In my opinion it should not, because 0123456789333444 does not contain 0123456789000001.

Can someone explain why the record shows up anyway?

EDIT

select * from sys.dm_fts_parser('"0123456789333444 Test"',1033,0,0)

returns the following:

group_id phrase_id occurrence special_term  display_term        expansion_type source_term
1        0         1           Exact Match  0123456789333444    0              0123456789333444 Test
1        0         1           Exact Match  nn0123456789333444  0              0123456789333444 Test
1        0         2           Exact Match  test                0              0123456789333444 Test
gsharp asked Nov 11 '13 16:11


3 Answers

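This is because @Keyword is not wrapped in double quotes. The quotes are what make the asterisk act as a prefix wildcard that matches zero, one, or more characters. The documentation for a prefix term says: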

Specifies a match of words or phrases beginning with the specified text. Enclose a prefix term in double quotation marks ("") and add an asterisk (*) before the ending quotation mark, so that all text starting with the simple term specified before the asterisk is matched. The clause should be specified this way: CONTAINS (column, '"text*"'). The asterisk matches zero, one, or more characters (of the root word or words in the word or phrase). If the text and asterisk are not delimited by double quotation marks, so the predicate reads CONTAINS (column, 'text*'), full-text search considers the asterisk as a character and searches for exact matches to text*. The full-text engine will not find words with the asterisk (*) character because word breakers typically ignore such characters.

When <prefix_term> is a phrase, each word contained in the phrase is considered to be a separate prefix. Therefore, a query specifying a prefix term of "local wine*" matches any rows with the text of "local winery", "locally wined and dined", and so on.

Have a look at the MSDN documentation on the topic.
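
Applied to the query from the question, that would look roughly like the sketch below. The table, column and variable names are the ones from the question. This is untested against the asker's data, so treat it as an illustration of the quoting rule rather than a confirmed fix.

```sql
DECLARE @Keyword NVARCHAR(100);

-- The double quotes are part of the search condition itself,
-- so the asterisk is read as a prefix wildcard.
SET @Keyword = N'"0123456789000001*"';

SELECT *
FROM MyTable
WHERE CONTAINS(Message, @Keyword);
```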

Mathew A. answered Oct 24 '22 16:10


Have you tried to query the following view to see what's on the system stoplist?

select * from sys.fulltext_system_stopwords where language_id = 1033;
NickyvV answered Oct 24 '22 16:10


Found a solution that works. I've added language 1033 as an additional parameter.

SELECT * FROM MyTable WHERE CONTAINS(Message, @Keyword, LANGUAGE 1033)
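
If the language argument is what makes the difference, it may be worth comparing it with the language the full-text index uses for the column. A minimal sketch, assuming the table is named MyTable as in the question and already has a full-text index. The quoted form of the keyword in the second query is an assumption for illustration.

```sql
-- Language (LCID) configured for each full-text indexed column of MyTable
SELECT c.name AS column_name,
       fic.language_id
FROM sys.fulltext_index_columns AS fic
JOIN sys.columns AS c
  ON c.object_id = fic.object_id
 AND c.column_id = fic.column_id
WHERE fic.object_id = OBJECT_ID(N'MyTable');

-- How the keyword is broken into terms by the word breaker for language 1033
SELECT display_term, special_term
FROM sys.dm_fts_parser(N'"0123456789000001*"', 1033, 0, 0);
```

If the language_id reported for the Message column is not 1033, the query and the index may be using different word breakers, which could explain why adding the language argument changes the results.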
gsharp answered Oct 24 '22 15:10