Using Lucene, one can retrieve the terms contained within in an index, i.e. the unique, stemmed words, excluding stop-words, that documents in the index contain. This is useful for generating autocomplete suggestions amongst other things. Is something similar possible with MS SQL Server full text indices?
Select columns name and language types for columns. You can only select character based and image based columns. Select change tracking. Now select the full-text catalog for index.
To get to this screen, go to the database and then go to Storage > Full Text Catalogs, find the full text catalog you want to work with and right click on it and select Properties.
A full-text index is a special type of index that provides index access for full-text queries against character or binary column data. A full-text index breaks the column into tokens and these tokens make up the index data.
You can use the new system view in SQL Server 2008 to get you the terms and count of occurrences, is this what you want?
sys.dm_fts_index_keywords_by_document
(
DB_ID('database_name'),
OBJECT_ID('table_name')
)
You need to supply the db_id
and object_id
of the fulltext table. This is the MSDN link for this: sys.dm_fts_index_keywords_by_document.
I agree that this information (words in the index, stemmed words, etc.) is usefull - and if SQL Server is serious about offering a serch platform, this information needs to be exposed. It's really not available in previous versions, as far as I can tell. However, the game changes in SQL Server 2008.
SQL Server 2008 offers new dynamic management views that offer this metadata for full text. Pay particular note to sys.dm_fts_parser and sys.dm_fts_index_keywords.
The sys.dm_fts_parser view takes in a phrase, along with a couple of other parameters and outputs a table showing a row set, showing stemmed versions of the individual words after the word breaker has deemed them as separate words.
MSDN gives the example of this query against the view:
SELECT * FROM sys.dm_fts_parser (' "The Microsoft business analysis" ', 1033, 0, 0)
To get the keywords, you can use sys.dm_fts_index_keywords.
I hope that points you in the right direction. Cheers.
With the following you can take the table name and column name as variables. It will work on the current database that it is run, omit end of file results and order the results descending by number of occurrences.
Note that full text indexing must be enabled on that table and column
DECLARE @TableName NVARCHAR(200) = 'MyTable'
DECLARE @ColumnName NVARCHAR(200) = 'MyColumn'
SELECT DISTINCT(display_term), occurrence_count FROM sys.dm_fts_index_keywords_by_document
(
DB_ID(),
OBJECT_ID(@TableName)
)
WHERE column_id =
(
SELECT TOP 1 col.column_id FROM sys.objects obj
INNER JOIN sys.columns col ON col.object_id = obj.object_id
WHERE obj.name = @TableName
AND col.name = @ColumnName
)
AND keyword != 0xFF
ORDER BY occurrence_count DESC
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With