
Elasticsearch term query does not give any results

I am very new to Elasticsearch and I have to perform the following query:

GET book-lists/book-list/_search
{  
   "query":{  
      "filtered":{  
         "filter":{  
            "bool":{  
               "must":[  
                  {  
                     "term":{  
                        "title":"Sociology"
                     }
                  },
                  {  
                     "term":{  
                        "idOwner":"17xxxxxxxxxxxx45"
                     }
                  }
               ]
            }
         }
      }
   }
}

According to the Elasticsearch documentation, it is equivalent to this pseudo-SQL:

SELECT document
FROM   book-lists
WHERE  title = "Sociology"
       AND idOwner = "17xxxxxxxxxxxx45"

The problem is that my document looks like this:

{  
   "_index":"book-lists",
   "_type":"book-list",
   "_id":"AVBRSvHIXb7carZwcePS",
   "_version":1,
   "_score":1,
   "_source":{  
      "title":"Sociology",
      "books":[  
         {  
            "title":"The Tipping Point: How Little Things Can Make a Big Difference",
            "isRead":true,
            "summary":"lorem ipsum",
            "rating":3.5
         }
      ],
      "numberViews":0,
      "idOwner":"17xxxxxxxxxxxx45"
   }
}

And the Elasticsearch query above doesn't return anything.

Whereas, this query returns the document above:

GET book-lists/book-list/_search
{  
   "query":{  
      "filtered":{  
         "filter":{  
            "bool":{  
               "must":[  
                  {  
                     "term":{  
                        "numberViews":"0"
                     }
                  },
                  {  
                     "term":{  
                        "idOwner":"17xxxxxxxxxxxx45"
                     }
                  }
               ]
            }
         }
      }
   }
}

This makes me suspect that the problem has something to do with "title" being the name of two fields (the top-level one and the one inside "books").

Is there a way to fix this without having to rename any of the fields? Or am I missing something else?

Thanks to anyone trying to help.

asked Oct 10 '15 11:10 by Mayas


1 Answer

Your problem is described in the documentation.

I suspect that you don't have any explicit mapping on your index, which means Elasticsearch will use dynamic mapping.

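For string fields, it will pass the string through the standard analyzer, which lowercases it (among other things). The index therefore contains the token sociology, while a term query looks up the exact value you give it, Sociology, without analyzing it. This is why your query doesn't work.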
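The mismatch can be reproduced outside Elasticsearch with a toy inverted index. This is only a sketch: `analyze` and `TinyIndex` are hypothetical names, and `analyze` is a rough stand-in for the standard analyzer (it splits on non-word characters and lowercases, nothing more).

```python
import re
from collections import defaultdict


def analyze(text):
    """Simplified stand-in for the standard analyzer: tokenize, then lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


class TinyIndex:
    """Toy inverted index: (field, token) -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, doc_id, doc):
        # Index time: every field value goes through the analyzer.
        for field, value in doc.items():
            for token in analyze(str(value)):
                self.postings[(field, token)].add(doc_id)

    def term(self, field, value):
        # Query time: a term lookup compares the value as-is, with no analysis.
        return set(self.postings.get((field, value), set()))


idx = TinyIndex()
idx.index("AVBRSvHIXb7carZwcePS",
          {"title": "Sociology", "idOwner": "17xxxxxxxxxxxx45"})

print(idx.term("title", "Sociology"))           # set()
print(idx.term("title", "sociology"))           # {'AVBRSvHIXb7carZwcePS'}
print(idx.term("idOwner", "17xxxxxxxxxxxx45"))  # {'AVBRSvHIXb7carZwcePS'}
```

The idOwner lookup succeeds because that value is a single token that lowercasing leaves unchanged, which is consistent with the second query in the question returning the document.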

Your options are:

  1. Specify an explicit mapping on the field so that it isn't analyzed before being stored in the index (index: not_analyzed).
  2. Clean your term query before sending it to Elasticsearch (in this specific query lowercasing will work, but note that the standard analyzer also does other things, such as splitting the text into separate word tokens, stripping most punctuation, and removing stop words if it is configured to, so depending on the title you may still have issues).
  3. Use a different query type (e.g., query_string instead of term), which will analyze the query before running it.
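To illustrate the caveat in option 2, here is a small Python sketch. The `analyze` function is a hypothetical, simplified stand-in for the standard analyzer (it only splits on non-word characters and lowercases), so treat it as an approximation. It suggests that lowercasing is enough for a single-word title but not for a multi-word one, because the index holds one token per word rather than the whole string.

```python
import re


def analyze(text):
    """Simplified stand-in for the standard analyzer: tokenize, then lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def lowercased_term_matches(title, term):
    """True if the lowercased term equals one of the tokens indexed for the title."""
    return term.lower() in analyze(title)


print(lowercased_term_matches("Sociology", "Sociology"))                  # True
print(lowercased_term_matches("The Tipping Point", "The Tipping Point"))  # False
print(analyze("The Tipping Point"))  # ['the', 'tipping', 'point']
```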

Looking at the sort of data you are storing, you probably need to specify an explicit not_analyzed mapping.
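As a rough sketch of what such a mapping could look like, the Python snippet below builds an index-creation body and prints it as JSON. It assumes the Elasticsearch 1.x/2.x mapping syntax (type string with index: not_analyzed), which matches the versions where the filtered query exists; later versions use a keyword type instead. The helper name `not_analyzed_mapping` is made up for illustration.

```python
import json


def not_analyzed_mapping(doc_type, fields):
    """Build an index-creation body mapping each field as a not_analyzed string."""
    properties = {
        field: {"type": "string", "index": "not_analyzed"} for field in fields
    }
    return {"mappings": {doc_type: {"properties": properties}}}


body = not_analyzed_mapping("book-list", ["title", "idOwner"])

# The printed JSON is meant to be sent as the body of a PUT request
# that creates the index (assumption: 1.x/2.x mapping syntax).
print(json.dumps(body, indent=2))
```

Keep in mind that the mapping of an existing field cannot be changed in place, so this would normally mean creating a new index with the mapping and reindexing the documents into it.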

For option three your query would look something like this:

{  
   "query":{  
      "filtered":{  
         "filter":{  
            "bool":{  
               "must":[  
                  {  
                     "query_string":{  
                        "fields": ["title"],
                        "analyzer": "standard",
                        "query": "Sociology"
                     }
                  },
                  {  
                     "term":{  
                        "idOwner":"17xxxxxxxxxxxx45"
                     }
                  }
               ]
            }
         }
      }
   }
}

Note that the query_string query has special syntax (e.g., OR and AND are not treated as literals) which means you have to be careful what you give it. For this reason explicit mapping with a term filter is probably more appropriate for your use case.

answered Sep 28 '22 12:09 by solarissmoke