I have a table definition as below
CREATE TABLE [dbo].[Dialogs](
[ID] [int] IDENTITY(1,1) NOT NULL,
[DiscussionID] [int] NOT NULL,
[ApprovedByUserID] [int] NULL,
[AddedByUserID] [int] NULL,
[Text] [nvarchar](max) NULL,
[ApprovalStatus] [int] NULL,
[ApprovedOn] [datetime] NULL,
[AddedOn] [datetime] NOT NULL,
CONSTRAINT [PK_dbo.Dialogs] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Text column contains html entered by user. i want to create full text index on Text column, i also need support for html filter so that when any user type <div>,<p> or any other html tag then it do not return any results.
To create index i use below SQL
CREATE FULLTEXT INDEX ON [Dialogs]
(
[Text] TYPE COLUMN '.html'
)
KEY INDEX [PK_dbo.Dialogs]
ON AOPRDefault;
but SQL Server throws error
Incorrect syntax near '.html'.
can any one please give a example of how to specify TYPE COLUMN option when creating full text index.
Firstly, the problem is that you should be referring to a column where you have '.html' rather than a literal, so you might have something like:
-- ADD COMPUTED COLUMN TO STORE FILE TYPE
ALTER TABLE dbo.Dialogs ADD FileExtension AS '.html';
CREATE FULLTEXT INDEX ON dbo.Dialogs ([Text] TYPE COLUMN FileExtension)
KEY INDEX [PK_dbo.Dialogs] ON AOPRDefault;
However you are misunderstanding the purpose of the TYPE COLUMN property, according to the documentation:
TYPE COLUMN type_column_name
Specifies the name of a table column, type_column_name, that is used to hold the document type for a varbinary(max) or image document. This column, known as the type column, contains a user-supplied file extension (.doc, .pdf, .xls, and so forth). The type column must be of type char, nchar, varchar, or nvarchar.
Specify TYPE COLUMN type_column_name only if column_name specifies a varbinary(max) or image column, in which data is stored as binary data; otherwise, SQL Server returns an error.
Note
At indexing time, the Full-Text Engine uses the abbreviation in the type column of each table row to identify which full-text search filter to use for the document in column_name. The filter loads the document as a binary stream, removes the formatting information, and sends the text from the document to the word-breaker component. For more information, see Configure and Manage Filters for Search
Since your index is on a text column this is not applicable and the create index statement would return an error. Even if you were storing the html document as binary data, then it would still not work as you intended, what you are after is html parsing, which is a separate issue from full text indexing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With