
How to implement Full-Text search in multilingual content in SQL Server

We have a site that supports several languages. It holds millions of rows, so we would like to use SQL Server Full-Text Search for searching.

Our current table structure is shown below.

CREATE TABLE Product
(
   ID INT IDENTITY(1,1),
   Code VARCHAR(50),
   ........
   ........
)

CREATE TABLE ProductLanguage
(
   ID INT,
   LanguageID INT,
   Name NVARCHAR(200),
   ........
   ........
)

We would like to implement Full-Text search on the "Name" column, so we created a Full-Text index on it. However, when creating a Full-Text index we can select only one language per column. If we select "English" or "Neutral", it does not return the expected data for other languages such as Japanese, Chinese, or French.
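For illustration, a single-language index of the kind described might look roughly like the sketch below. The original table definition is elided, so the catalog name ProductCatalog, the column RowID, and the index UX_ProductLanguage_RowID are hypothetical; the sketch assumes RowID is a non-nullable, unique column, since a full-text index needs a unique, single-column, non-nullable key index.

```sql
-- Sketch only: ProductCatalog, RowID and UX_ProductLanguage_RowID are
-- hypothetical names, not part of the original schema.
CREATE FULLTEXT CATALOG ProductCatalog;

-- Assumes RowID is declared NOT NULL and holds unique values.
CREATE UNIQUE INDEX UX_ProductLanguage_RowID ON ProductLanguage (RowID);

-- One LANGUAGE per column: 1033 is the LCID for US English, so every row
-- is word-broken as English regardless of its LanguageID.
CREATE FULLTEXT INDEX ON ProductLanguage (Name LANGUAGE 1033)
    KEY INDEX UX_ProductLanguage_RowID
    ON ProductCatalog;
```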

So what is the best way to implement Full-Text search in SQL Server for multilingual content?

Do we need to create a different table? If so, what would the table structure be (keeping in mind that the languages are not fixed; more can be added later), and what would the search query look like?

We are using SQL Server 2008 R2.

Debajit Mukhopadhyay asked Jun 19 '13 06:06


1 Answer

Certain content (document) types support language settings - e.g. Microsoft Office Documents, PDF, [X]HTML, or XML.

If you change the type of your Name column to XML, you can declare the language of each value individually (i.e. per row). For instance:

Instead of storing values as strings

name 1
name 2
name 3

...you could store them as XML documents with the appropriate language declarations:

<content xml:lang="en-US">name 1</content>
<content xml:lang="fr-FR">name 2</content>
<content xml:lang="en-GB">name 3</content>

During Full-Text index population, the correct word breaker/stemmer will be used based on the language setting of each value (XML document): US English for name 1, French for name 2, and UK English for name 3.
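A minimal sketch of what this could look like in T-SQL is shown below. The table ProductLanguageXml, the RowID key column, the index name, and the catalog name are all hypothetical, and the search term is only an example. As far as I understand, the LANGUAGE given for the column serves as the default for values without an xml:lang declaration and for queries that do not name a language, so the query passes the language of the search term explicitly.

```sql
-- Sketch only: ProductLanguageXml, RowID, UX_ProductLanguageXml_RowID and
-- ProductXmlCatalog are hypothetical names.
CREATE TABLE ProductLanguageXml
(
   RowID INT IDENTITY(1,1) NOT NULL,  -- surrogate key for the full-text index
   ID INT NOT NULL,                   -- product ID, as in ProductLanguage
   LanguageID INT NOT NULL,
   Name XML NOT NULL                  -- e.g. <content xml:lang="fr-FR">name 2</content>
);

-- A full-text index needs a unique, single-column, non-nullable key index.
CREATE UNIQUE INDEX UX_ProductLanguageXml_RowID ON ProductLanguageXml (RowID);

CREATE FULLTEXT CATALOG ProductXmlCatalog;

-- LANGUAGE 0 (neutral) is assumed to apply only to values without xml:lang.
CREATE FULLTEXT INDEX ON ProductLanguageXml (Name LANGUAGE 0)
    KEY INDEX UX_ProductLanguageXml_RowID
    ON ProductXmlCatalog;

-- Pass the language of the search term at query time (1036 = French).
SELECT ID, LanguageID
FROM ProductLanguageXml
WHERE CONTAINS(Name, N'chaise', LANGUAGE 1036);
```

The LCID is written as a literal here; a search for Japanese text would use 1041 instead, and so on for each language the site supports.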

Of course, this would require a significant change in the way your data is managed and consumed.
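To give a rough idea of that change, the sketch below wraps existing names in XML and reads them back as plain text. It assumes a hypothetical lookup table LanguageTag (LanguageID, Tag) that maps each LanguageID to a tag such as fr-FR, and a hypothetical new column NameXml; neither exists in the original schema. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Sketch only: LanguageTag, its Tag column and NameXml are hypothetical.
ALTER TABLE ProductLanguage ADD NameXml XML NULL;
GO

-- FOR XML PATH('') entitizes characters such as & and < in the name,
-- so the concatenated string is well-formed before the cast to XML.
UPDATE pl
SET NameXml = CAST(
        N'<content xml:lang="' + lt.Tag + N'">'
        + (SELECT pl.Name AS [text()] FOR XML PATH(''))
        + N'</content>' AS XML)
FROM ProductLanguage AS pl
JOIN LanguageTag AS lt ON lt.LanguageID = pl.LanguageID;

-- Consumers read the plain text back out of the XML value.
SELECT ID, LanguageID,
       NameXml.value('(/content/text())[1]', 'NVARCHAR(200)') AS Name
FROM ProductLanguage;
```

Rows whose Name is NULL, or whose LanguageID has no entry in the lookup table, are left with a NULL NameXml in this sketch.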

ML

Matija Lah answered Nov 03 '22 14:11