Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Full Text Search of URL field sql server

Objective: Return all URLs beginning with "https://mywebsite.domain.com/as/product/4/"

Given:

  • Applied full text search on URL field.
  • SQL Server edition: 2014.
  • 20+ Million rows

URL

https://mywebsite.domain.com/as/product/1/production
https://mywebsite.domain.com/as/product/2/items
https://mywebsite.domain.com/as/product/1/affordability
https://mywebsite.domain.com/as/product/3/summary
https://mywebsite.domain.com/as/product/4/schedule
https://mywebsite.domain.com/as/product/4/resources/summary

Query 1:

WHERE CONTAINS (URL, 'https://mywebsite.domain.com/as/product/4')

Result:

All records returned

Query 2 (Added "*" after reading MSDN article)

WHERE CONTAINS (URL, '"https://mywebsite.domain.com/as/product/4*"')

Result:

No records returned

Any assistance would be greatly appreciated.

like image 432
WorkSmarter Avatar asked Nov 18 '16 00:11

WorkSmarter


2 Answers

You can use CONTAINS with a LIKE subquery for matching only a start:

SELECT * 
FROM (
SELECT * 
FROM myTable WHERE CONTAINS (URL, '"https://mywebsite.domain.com/as/product/4/"')
) AS S1 
WHERE S1.URL LIKE 'https://mywebsite.domain.com/as/product/4/%' 

This way, the SLOW LIKE operator query will be run against a smaller set of records

EDIT1: (if WHERE CONTAINS (URL, '"https://mywebsite.domain.com/as/product/4/"') is not filtering Values)

After a lot of searches. the problem is in / . The forward-slash isn't contained in the Noise Words file, but I guess it's classed as a delimiter or Word breaker and therefore isn't searchable.

Read these Topics:

  • experts-exchange Topic
  • Stackoverflow Topic
  • Google Groups

EDIT2:

I found one suggested solution that is

/ is considered as an english wordbreaker You may change It from Registry

  • Navigate to Registry values HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<InstanceRoot>\MSSearch\Language\eng and HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<InstanceRoot>\MSSearch\Language\enu
  • clear value for WBreakerClass.

Sql server consider https://mywebsite.domain.com/as/product/4 as one word.

Note: above both path i have taken by assuming that you are using English language as word breaker.

Read more about Word Breaker in This MSDN Topic

like image 172
Hadi Avatar answered Oct 14 '22 14:10

Hadi


Use the Like operator:

WHERE URL LIKE 'https://mywebsite.domain.com/as/product/4%'

The % is a wildcard. This should return all records that start with a pattern match up to the first wildcard %.

like image 44
Rafael Avatar answered Oct 14 '22 14:10

Rafael