I have a table with HTML data, that I want to search using the Full Text Index via an html-filter
So I created an index:
CREATE FULLTEXT CATALOG myCatalog AS DEFAULT
CREATE FULLTEXT INDEX ON myTable (Body TYPE COLUMN Filetype)
KEY INDEX PK_myTable
Body
is a varbinary(max)
column with HTML. The Filetype
column is a computed column returns .html
.
No results are being returned.
I verified that .html
filter is installed. FullText index is also installed properly and works fine if I convert the column to nvarchar
and create just a "plain text" index (not html).
No errors in the SQL log or FTS log.
The keywords table is just empty!
SELECT *
FROM sys.dm_fts_index_keywords
(DB_ID('myDatabase'), OBJECT_ID('myTable'))
All it returns is "END OF FILE" symbol.
It says "document count 35" which mean the documents were processed, but no keywords were extracted.
PS. I have SQL Server Express Edition 2012 (with all advanced features including full text). Can this be the reason? But again, the "plain" full text search works just fine!
PPS. Asked my coworker to test this on SQL Express 2016 - same result... Tried on our production server "Enterprise" edition - same.
OK it turns out the full text index DOES NOT SUPPORT UNICODE!! in varbinary columns. When I converted the column to non-unicode (by converting it to nvarchar
then to varchar
and then back to varbinary
) It started working.
Anyone knows any workarounds?
OK, so it turns out fulltext index DOES support unicode data in varbinary
but pay attention to this:
If your varbinary
column is created from Nvarchar
be sure to include the 0xFFFE
unicode signature at the beginning
For example, I'm using a computed column for full text index, so I had to change my computed column to this:
alter table myTable
add FTS_Body as 0xFFFE + (CAST(HtmlBody as VARBINARY(MAX)))
--HtmlBody is my nvarchar column that contains html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With