
Elasticsearch - Efficiency of search across multiple types

Let me explain my question with an example. Let's say I have three different types of documents with some common fields, namely book, song, and magazin:

  • Book has name, author, publisher, pageNumber etc.
  • Song has name, singer, publisher, length etc.
  • Magazin has name, company, publisher, pageNumber etc.

As you can see, name and publisher are common to all three types of documents. pageNumber is shared by Magazin and Book. The rest of the fields are specific to a single type.

I will store all of this data in the same index. I can store it either

  • as a single type, such as Object, with a category field (Book, Song, Magazin) in it. I provide the mapping details when the index is first created, so with this option a book will have a length field, but it will be empty, since length is not a Book feature,

  • or as three separate types, distinguished by the _type field.

My queries and facets will be on the common fields. Which of these approaches would give lower query and facet times?

Is /index/book,song,magazin/ -d {myQuery} more efficient than /index/object/ -d {myQuery && (category = book || category = song || category = magazin)} ?

Thanks for the answers.

shyos asked Dec 30 '13 10:12



1 Answer

Elasticsearch's type concept does not exist in Lucene.

When indexing documents, the document's type gets indexed. Then, when searching on just certain types, Elasticsearch will implicitly add a filter on the indexed type to your query.

Thus, with your single-type approach, you would have your category filter in addition to the implicit _type:object filter. Essentially, you gain nothing by avoiding Elasticsearch's types here.
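To make the comparison concrete, here is a rough sketch of the two request bodies, written as plain Python dictionaries. It assumes the `filtered` query and `terms`/`bool` filters from the Elasticsearch versions current when this was asked. The `name` match query is a hypothetical stand-in for `myQuery`, and the wrapping shown is only an approximation of what Elasticsearch does internally, not its actual implementation.

```python
import json

# Type names taken from the question.
TYPES = ["book", "song", "magazin"]


def multi_type_search(user_query, types):
    """Approximate effective query for /index/book,song,magazin/_search.

    Elasticsearch adds a filter on the indexed _type field implicitly;
    it is written out here only to make it visible.
    """
    return {
        "query": {
            "filtered": {
                "query": user_query,
                "filter": {"terms": {"_type": types}},
            }
        }
    }


def single_type_search(user_query, categories):
    """Approximate effective query for /index/object/_search.

    The explicit category filter is combined with the implicit
    _type:object filter, so two filters are evaluated instead of one.
    """
    return {
        "query": {
            "filtered": {
                "query": user_query,
                "filter": {
                    "bool": {
                        "must": [
                            {"terms": {"_type": ["object"]}},
                            {"terms": {"category": categories}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical query on a common field, standing in for myQuery.
    my_query = {"match": {"name": "some title"}}
    print(json.dumps(multi_type_search(my_query, TYPES), indent=2))
    print(json.dumps(single_type_search(my_query, TYPES), indent=2))
```

Printing both bodies side by side shows one type filter in the multi-type case and a type filter plus a category filter in the single-type case, which is the point made above.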

Alex Brasetvik answered Nov 12 '22 12:11
