Problem with ampersand (&) in full-text search
How can I search for words (or phrases) that contain an ampersand (&)?
For example, the database contains:
1: "Johnson & Johnson"
2: "AT&T"
3: "Sample & Sample"
How should I write a full text search query to search for individual records?
SELECT * from Companies c WHERE CONTAINS(c.CompanyName, '"AT&T"')
I know that the ampersand (&) is the operator for logical AND, but I do not know how to escape it so that full-text search treats it as part of the text.
Any idea?
Use an ampersand (&) to identify each variable in your SQL statement. You do not need to define the value of each variable. Toggling the display of the text of a command before and after SQL*Plus replaces substitution variables with values.
Full-Text Search in SQL Server and Azure SQL Database lets users and applications run full-text queries against character-based data in SQL Server tables.
If you set escape on, the backslash is used as the escape character, so a backslash followed by an ampersand (\&) gives you a literal ampersand. Hope this helps.
Q: How can I tell if Full-Text Search is enabled on my SQL Server instance? A: You can determine whether Full-Text Search is installed by querying FULLTEXTSERVICEPROPERTY with the 'IsFullTextInstalled' property. If the query returns 1, Full-Text Search is installed.
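The query that snippet refers to did not make it into this page. A minimal version, assuming the documented 'IsFullTextInstalled' property is the one meant, might look like this:

```sql
-- Returns 1 if the Full-Text Search component is installed on this instance, 0 if it is not.
SELECT FULLTEXTSERVICEPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;
```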
Short version: You can't (or at least you can, but you may get more results than you expected).
Long version: The character '&' is treated as a "word breaker", i.e. when SQL Server encounters an '&' it treats it as the start of a new "word" (i.e. token). What SQL Server sees when parsing "AT&T" is two tokens, "AT" and "T".
You can check this for yourself using sys.dm_fts_parser:
SELECT * FROM sys.dm_fts_parser('AT&T', 1033, 0, 0)
keyword     group_id  phrase_id  occurrence  special_term  display_term  expansion_type  source_term
----------  --------  ---------  ----------  ------------  ------------  --------------  -----------
0x00610074  1         0          1           Noise Word    at            0               AT
0x0074      2         0          1           Noise Word    t             0               T
This means that searching for "AT&T" is pretty much exactly the same as searching for "AT T".
This is by design. As far as I can see, the only way to modify this behaviour would be to install your own word breaker, but that isn't something I would recommend doing.
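If the extra results are the real problem, one workaround (my own sketch, not part of the answer above) is to let full-text search narrow the candidates and then filter them with an ordinary LIKE, where '&' has no special meaning. This assumes the Companies table from the question with a full-text index on CompanyName:

```sql
-- Assumes the Companies table from the question, full-text indexed on CompanyName.
SELECT c.*
FROM Companies c
WHERE CONTAINS(c.CompanyName, N'"Johnson & Johnson"')  -- phrase match on the tokens the word breaker produces
  AND c.CompanyName LIKE N'%Johnson & Johnson%';       -- literal check: keeps only rows that really contain the '&'
```

The LIKE predicate only has to run against rows that already passed the full-text predicate, so it should stay cheap in most cases, though I have not measured it.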
The accepted answer isn't entirely correct. Enclosing the search term in double quotes makes the group of words a "phrase" match. In this case, the ampersand (&) can be treated as a literal character, such as when it is surrounded by one or more letters that do not form a known word. Just looking at your "AT&T" example, we see:
DECLARE @Term NVARCHAR(100);
SET @Term = N'"AT&T"';
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 1);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 1);
GO
Returns:
keyword             group  phrase  occurrence  special      display  expansion  source
                    id     id                  term         term     type       term
0x0061007400260074  1      0       1           Exact Match  at&t     0          AT&T
As you can see, the ampersand presents no problem at all, as long as it is enclosed in double quotes ("), which you are already doing, woo hoo!
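For anyone wondering why there are four calls: they differ only in the last two arguments. As far as I know the signature is sys.dm_fts_parser(query_string, lcid, stoplist_id, accent_sensitivity), so the calls cover the system stoplist versus no stoplist, and accent-insensitive versus accent-sensitive. A trimmed-down single call, selecting only the columns that matter here, could look like this:

```sql
-- sys.dm_fts_parser(query_string, lcid, stoplist_id, accent_sensitivity)
--   lcid 1033          : English (US) word breaker
--   stoplist_id        : 0 = system stoplist, NULL = no stoplist
--   accent_sensitivity : 0 = insensitive, 1 = sensitive
SELECT occurrence, special_term, display_term, source_term
FROM sys.dm_fts_parser(N'"AT&T"', 1033, NULL, 0);
```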
However, that doesn't work as cleanly for the "Johnson & Johnson" example:
DECLARE @Term NVARCHAR(100);
SET @Term = N'"Johnson & Johnson"';
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 1);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 1);
GO
Returns:
keyword                         group  phrase  occurrence  special      display  expansion  source
                                id     id                  term         term     type       term
0x006A006F0068006E0073006F006E  1      0       1           Exact Match  johnson  0          Johnson & Johnson
0x006A006F0068006E0073006F006E  1      0       2           Exact Match  johnson  0          Johnson & Johnson
That would seem to also match a search term of Johnson Johnson, which isn't technically correct.
So, in addition to enclosing the term in double quotes, you can also convert the ampersand to an underscore (_), which is handled differently:
DECLARE @Term NVARCHAR(100);
SET @Term = N'"Johnson _ Johnson"';
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 1);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 1);
GO
Returns:
keyword                         group  phrase  occurrence  special      display  expansion  source
                                id     id                  term         term     type       term
0x006A006F0068006E0073006F006E  1      0       1           Exact Match  johnson  0          Johnson _ Johnson
0x005F                          1      0       2           Exact Match  _        0          Johnson _ Johnson
0x006A006F0068006E0073006F006E  1      0       3           Exact Match  johnson  0          Johnson _ Johnson
And doing that one-character translation does not seem to adversely affect the original "AT&T" search:
DECLARE @Term NVARCHAR(100);
SET @Term = N'"AT_T"';
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, 0, 1);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 0);
SELECT * FROM sys.dm_fts_parser(@Term, 1033, NULL, 1);
Returns:
keyword             group  phrase  occurrence  special      display  expansion  source
                    id     id                  term         term     type       term
0x00610074005F0074  1      0       1           Exact Match  at_t     0          AT_T
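Putting the two steps together (double quotes plus the ampersand-to-underscore translation), a rough sketch against the Companies table from the question might look like the following. The variable names are only for illustration. Note that the parser output above only shows how the search term is tokenized; whether rows stored with a real '&' are matched depends on how the indexed text was tokenized, so test this against your own data before relying on it.

```sql
-- Illustrative sketch: @Search and @Term are hypothetical names, and Companies is
-- assumed to have a full-text index on CompanyName.
DECLARE @Search NVARCHAR(100);
DECLARE @Term   NVARCHAR(110);

SET @Search = N'Johnson & Johnson';

-- Translate '&' to '_' and wrap the result in double quotes to make it a phrase term.
SET @Term = N'"' + REPLACE(@Search, N'&', N'_') + N'"';

SELECT @Term AS SearchTerm;   -- inspect the term that will be passed to CONTAINS

SELECT c.*
FROM Companies c
WHERE CONTAINS(c.CompanyName, @Term);
```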