
Use SQL Server FTS Stemmer

Is there any way to directly access the stemmer used by the FORMSOF() option of a CONTAINS full-text search query, so that it returns the stems and inflections of an input word rather than only those forms that exist in a searched column?

For example, the query

SELECT * FROM dbo.MyDB WHERE contains(CHAR_COL,'FORMSOF(INFLECTIONAL, prettier)')

matches the stem "pretty" and other inflections such as "prettiest" only if they exist in the CHAR_COL column. What I want is to call FORMSOF() directly, without referencing a column at all. Any chance?

EDIT: The query that met my needs ended up being

SELECT * FROM 
    (SELECT ROW_NUMBER() OVER (PARTITION BY group_ID ORDER BY GROUP_ID) ord, display_term
    from sys.dm_fts_parser('FORMSOF( FREETEXT, running) and FORMSOF(FREETEXT, jumping)', 1033, null, 1)) a
WHERE ord=1

Note that sys.dm_fts_parser requires membership in the sysadmin fixed server role and access rights to the specified stoplist.

Laramie asked Nov 11 '10 20:11


1 Answer

No, you cannot do this. There is no way to access the stemmer directly.

You can get an idea of how stemming works by looking at the Solr source code, but its stemmer may well (and I would guess does) differ from the one implemented in SQL Server full-text search.

UPDATE: It turns out that in SQL Server 2008 R2 you can do something quite close to what you want. A special table-valued system function was added:

 sys.dm_fts_parser('query_string', lcid, stoplist_id, accent_sensitivity)

It lets you see the tokenization result for a query string, i.e. the output after word breaking, thesaurus expansion and stoplist filtering have been applied. So if you feed it 'FORMSOF(....)', it returns the forms you are after, although you will still have to post-process the result set. See the MSDN article on sys.dm_fts_parser for details.
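As a minimal sketch of that idea, using the word from the question: the call below asks the parser for the inflectional forms of "prettier" without touching any table. The argument values are assumptions chosen for illustration (1033 as the English LCID, 0 for the system stoplist, 0 for accent-insensitive), and the exact set of forms returned depends on the stemmer installed for that language.

```sql
-- Sketch only: list the terms the full-text parser generates for one word.
-- Assumed arguments: 1033 = English (US) LCID, 0 = system stoplist,
-- 0 = accent-insensitive. Adjust them for your own language and stoplist.
SELECT display_term,    -- human-readable form of each generated term
       source_term,     -- the input term this row was derived from
       expansion_type,  -- distinguishes the original word from expanded forms
       special_term     -- flags rows such as noise words or end-of-sentence
FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, prettier)', 1033, 0, 0);
```

Filtering on expansion_type should let you separate the original word from its generated inflections; check the documented values for that column before relying on a specific number.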

AlexS answered Sep 24 '22 20:09