What is an index in Elasticsearch

What is an index in Elasticsearch? Does one application have multiple indexes or just one?

Let's say you built a system for some car manufacturer. It deals with people, cars, spare parts, etc. Do you have one index named manufacturer, or do you have one index for people, one for cars and a third for spare parts? Could someone explain?

asked Feb 22 '13 13:02

LuckyLuke

1 Answer

Good question, and the answer is a lot more nuanced than one might expect. You can use indices for several different purposes.

Indices for Relations

The easiest and most familiar layout mirrors what you would expect from a relational database. You can (very roughly) think of an index as a database.

MySQL => Databases => Tables => Rows/Columns
ElasticSearch => Indices => Types => Documents with Properties

An ElasticSearch cluster can contain multiple Indices (databases), which in turn contain multiple Types (tables). These types hold multiple Documents (rows), and each document has Properties (columns).

So in your car manufacturing scenario, you may have a subaru_factory index (index names must be lowercase). Within this index, you have three different types:

People
Cars
Spare_Parts

Each type then contains documents that correspond to that type. For example, a Subaru Impreza doc lives inside the Cars type and contains all the details about that particular car.

Searching and querying takes the format of: http://localhost:9200/[index]/[type]/[operation]

So to retrieve the Subaru document, I may do this:

  $ curl -XGET localhost:9200/subaru_factory/Cars/SubaruImpreza
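As a rough illustration of the [index]/[type]/[id] layout described above, here is a minimal Python sketch that only builds the URL and sends nothing over the network. The helper name document_url and the default host are assumptions made for this example, not part of any Elasticsearch client. It lowercases the index name because Elasticsearch rejects index names containing uppercase letters.

```python
from urllib.parse import quote

def document_url(index, doc_type, doc_id, host="http://localhost:9200"):
    """Build the URL for one document as host/index/type/id (hypothetical helper)."""
    # Index names must be lowercase, so normalize the index segment only.
    parts = [index.lower(), doc_type, doc_id]
    # Percent-encode each path segment so odd characters cannot break the path.
    return host + "/" + "/".join(quote(p, safe="") for p in parts)

print(document_url("subaru_factory", "Cars", "SubaruImpreza"))
```

The resulting string can be passed to curl or any HTTP client. Note that this typed URL layout matches the Elasticsearch versions this answer was written for.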

Indices for Logging

Now, the reality is that Indices/Types are much more flexible than the Database/Table abstractions we are used to in RDBMSs. They can be considered convenient data organization mechanisms, with added performance benefits depending on how you set up your data.

To demonstrate a radically different approach, a lot of people use ElasticSearch for logging. A standard format is to assign a new index for each day. Your list of indices may look like this:

logs-2013-02-22
logs-2013-02-21
logs-2013-02-20

ElasticSearch allows you to query multiple indices at the same time, so it isn't a problem to do:

  $ curl -G 'localhost:9200/logs-2013-02-22,logs-2013-02-21/Errors/_search' --data-urlencode 'q="Error Message"'

This searches the logs from the last two days at the same time. The format suits the nature of logs: most logs are never looked at, and they are organized in a linear flow of time. Making an index per day is more logical and offers better search performance, because a query only touches the days you ask for.
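The index-per-day scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names daily_indices and search_url, the "logs-" prefix default, and the host are made up for the example, and the code builds a URL without contacting a cluster.

```python
from datetime import date, timedelta
from urllib.parse import quote, urlencode

def daily_indices(last_day, days, prefix="logs-"):
    """Index names for `days` consecutive days ending at `last_day`, newest first."""
    return [prefix + (last_day - timedelta(days=n)).isoformat() for n in range(days)]

def search_url(indices, doc_type, query, host="http://localhost:9200"):
    """Build a multi-index search URL (hypothetical helper)."""
    # Several indices are addressed at once by joining their names with commas.
    path = ",".join(indices)
    # Percent-encode the query so quotes and spaces survive in the URL.
    params = urlencode({"q": query}, quote_via=quote)
    return f"{host}/{path}/{doc_type}/_search?{params}"

indices = daily_indices(date(2013, 2, 22), 2)
print(indices)
print(search_url(indices, "Errors", '"Error Message"'))
```

Widening the search window is then just a matter of asking for more days, and old days can be left out of the list entirely.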

Indices for Users

Another radically different approach is to create an index per user. Imagine you have some social networking site, and each user has a large amount of random data. You can create a single index for each user. Your structure may look like:

Zach's Index
- Hobbies Type
- Friends Type
- Pictures Type
Fred's Index
- Hobbies Type
- Friends Type
- Pictures Type

Notice how this setup could easily be done in a traditional RDBMS fashion (e.g. a "Users" index, with hobbies/friends/pictures as types). All users would then be thrown into a single, giant index.

Instead, it sometimes makes sense to split data apart for data organization and performance reasons. In this scenario, we are assuming each user has a lot of data, and we want them separate. ElasticSearch has no problem letting us create an index per user.
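One practical detail with an index per user is turning a username into a valid index name. The following Python sketch shows one possible way to do it. The "user-" prefix, the slug rules, and the function names are all assumptions for illustration; the only hard requirement relied on here is that index names must be lowercase.

```python
import re

def user_index_name(username, prefix="user-"):
    """Derive a per-user index name (hypothetical naming scheme)."""
    # Lowercase first, then collapse anything outside a-z, 0-9, '_' and '-' into '-'.
    slug = re.sub(r"[^a-z0-9_-]+", "-", username.lower()).strip("-")
    if not slug:
        raise ValueError("username has no usable characters")
    return prefix + slug

def user_search_url(username, doc_type, host="http://localhost:9200"):
    """Search URL for one type inside one user's index."""
    return f"{host}/{user_index_name(username)}/{doc_type}/_search"

print(user_index_name("Zach"))
print(user_search_url("Fred", "Hobbies"))
```

Two different usernames can collapse to the same slug under these rules, so a real system would more likely key the index on a unique user ID.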

answered Oct 03 '22 23:10

Zach

Tags: java, ruby, full-text-search, elasticsearch