Elasticsearch

Question

Let me explain my question with an example. Say I have three different types of documents with some common fields: book, song, and magazin.

Book has name, author, publisher, pageNumber etc.
Song has name, singer, publisher, length etc.
Magazin has name, company, publisher, pageNumber etc.

As you can see, name and publisher are common to all three types of documents. pageNumber is a feature of both Magazin and Book. The rest of the fields are specific to each type.

I will store this data in the same index. I can store it either

as a single type, such as Object, with a category field (Book, Song, Magazin). I provide the mapping when the index is first created, so with this option a book would still have a length field, but it would be empty, since length is not a Book feature;
or as three separate document types, distinguished by the _type field.

My queries and facets will be on the common fields. Which of these approaches would give lower query and facet times?

Is /index/book,song,magazin/ -d {myQuery} more efficient than /index/object/ -d {myQuery && (category = book || category = song || category = magazin)} ?

Thanks for the answers.

Alex Brasetvik · Accepted Answer

Elasticsearch's type concept does not exist in Lucene.

When indexing documents, the document's type gets indexed. Then, when searching on just certain types, Elasticsearch will implicitly add a filter on the indexed type to your query.

Thus, with your last approach, you would have your category-filter in addition to the implicit _type:object-filter. Essentially, you are not gaining anything by not using Elasticsearch's types here.
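To make the comparison concrete, here is a rough sketch in Python that builds the two request bodies as plain dictionaries. It only illustrates the idea of the implicit type filter; it is not what Elasticsearch literally runs internally. It assumes a version that still supports multiple types per index and the bool query's filter clause, and the function names and the example publisher value are made up for illustration.

```python
import json


def with_types(user_query, types):
    """Conceptual equivalent of searching /index/book,song,magazin/:
    the user's query plus one filter on the indexed _type field."""
    return {
        "query": {
            "bool": {
                "must": [user_query],
                "filter": [{"terms": {"_type": types}}],
            }
        }
    }


def with_category_field(user_query, categories):
    """Conceptual equivalent of searching /index/object/ with a category
    condition: the implicit _type filter is still there, and the explicit
    category filter comes on top of it."""
    return {
        "query": {
            "bool": {
                "must": [user_query],
                "filter": [
                    {"terms": {"_type": ["object"]}},
                    {"terms": {"category": categories}},
                ],
            }
        }
    }


# Hypothetical query on a common field; the value is a placeholder.
my_query = {"match": {"publisher": "some publisher"}}
kinds = ["book", "song", "magazin"]

typed = with_types(my_query, kinds)
single = with_category_field(my_query, kinds)

print(json.dumps(typed, indent=2))
print(json.dumps(single, indent=2))
```

The second body carries two filters where the first carries one, which is the point of the answer: the single-type layout does the same type filtering and then adds the category filter as extra work.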

Elasticsearch - Efficiency of search across multiple types

Tags:

mapping

shyos

