
App Engine Search API (Document Search) - Multiple Languages

I have Documents that I'd like to make searchable in three different languages. Because a Document can contain multiple fields with the same name and type, the following structure works (this is a simplified example).

from google.appengine.api import search

# One TextField per language, all sharing the field name "name".
document = search.Document(
    fields=[
        search.TextField(
            name="name",
            language="en",
            value="dog"),
        search.TextField(
            name="name",
            language="es",
            value="perro"),
        search.TextField(
            name="name",
            language="fr",
            value="chien")
    ]
)
index = search.Index("my_index")
index.put(document)

Specifying the language helps Google tokenize the value of the TextField.

The following queries all work, each returning one result:

print(index.search("name: dog"))
print(index.search("name: perro"))
print(index.search("name: chien"))

Here is my question: Can I restrict a search to only target fields with a specific language?

The purpose is to avoid false positives. Since all three languages use the Latin alphabet, someone performing a full-text search in Spanish may see English results that are not relevant.
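
To make the concern concrete, here is a toy in-memory sketch (plain Python, not the Search API) of how a match that ignores language can return an irrelevant hit. The `FIELDS` data and the `match` helper are made up for illustration; "pie" is used because it is both an English word and the Spanish word for "foot".

```python
# Toy model: each entry is (doc_id, field_name, language, value).
# This is NOT the Search API, only an illustration of the problem.
FIELDS = [
    ("doc1", "name", "en", "pie"),    # English: a baked dish
    ("doc1", "name", "es", "tarta"),
    ("doc2", "name", "en", "foot"),
    ("doc2", "name", "es", "pie"),    # Spanish: foot
]


def match(term, language=None):
    """Return ids of documents whose "name" field equals term.

    If language is given, only fields tagged with that language count.
    """
    hits = []
    for doc_id, field_name, lang, value in FIELDS:
        if field_name != "name" or value != term:
            continue
        if language is not None and lang != language:
            continue
        if doc_id not in hits:
            hits.append(doc_id)
    return hits
```

Here `match("pie")` finds both documents, while `match("pie", language="es")` finds only the one whose Spanish name is "pie". That second behaviour is what I would like from the Search API.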

Thank you.

Aaron Drenberg asked Jun 22 '17 06:06



1 Answer

You could use a separate index for each language.

Define a utility function for resolving the correct index for a given language:

from google.appengine.api import search

def get_index(lang):
    return search.Index("my_index_{}".format(lang))
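
Because the index name is built directly from `lang`, an unexpected value (a typo, different casing) would refer to a different and possibly empty index. A minimal sketch of a guard, assuming a fixed set of supported languages; the `SUPPORTED_LANGUAGES` tuple and the `index_name` helper are hypothetical and not part of the Search API:

```python
# Hypothetical whitelist; adjust to the languages you actually index.
SUPPORTED_LANGUAGES = ("en", "es", "fr")


def index_name(lang, prefix="my_index"):
    """Build the per-language index name, e.g. "my_index_fr"."""
    lang = lang.strip().lower()
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError("unsupported language: {!r}".format(lang))
    return "{}_{}".format(prefix, lang)
```

With this in place, `get_index` could return `search.Index(index_name(lang))` instead of formatting the name inline.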

Insert documents:

document = search.Document(
    fields=[
      search.TextField(
        name="name",
        language="en",
        value="dog"),
    ])

get_index('en').put(document)

document = search.Document(
    fields=[
      search.TextField(
        name="name",
        language="fr",
        value="chien")
    ])

get_index('fr').put(document)

Query by language:

query = search.Query(
    'name: chien')

results = get_index('fr').search(query)

for doc in results:
    print(doc)
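
If separate indexes are inconvenient, another option (not part of the approach above, and not tested here against the Search API) might be to keep a single index and encode the language in the field name, for example `name_es`, so that a query such as `name_es: perro` only targets the Spanish field. A sketch of the naming helpers, where `localized_field` and `localized_query` are hypothetical names:

```python
def localized_field(base, lang):
    """Return a field name that encodes the language, e.g. "name_es"."""
    return "{}_{}".format(base, lang)


def localized_query(base, lang, term):
    """Return a query string restricted to one language's field."""
    return "{}: {}".format(localized_field(base, lang), term)
```

Each field would then be created with `search.TextField(name=localized_field("name", "es"), language="es", value="perro")`, and searched with `index.search(localized_query("name", "es", "perro"))`. The trade-off is that a cross-language search has to name every localized field explicitly.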
Frank Wilson answered Nov 08 '22 22:11
