Are index aliases and wildcard index endpoints in Elasticsearch exactly the same thing?

The Elasticsearch documentation for Index Aliases says:

The index aliases API allow to alias an index with a name, with all APIs automatically converting the alias name to the actual index name. An alias can also be mapped to more than one index, and when specifying it, the alias will automatically expand to the aliases indices.

And the documentation for Multiple Indices says:

Most APIs that refer to an index parameter support execution across multiple indices, using simple test1,test2,test3 notation (or _all for all indices). It also support wildcards, for example: test*, and the ability to "add" (+) and "remove" (-), for example: +test*,-test3.

Scenario #1

You have 12 monthly indices from the year 2014, each named with a date pattern, e.g. someprefix_2014-07.
You map all of these indices to an alias named 2014.
Both of these requests would return the same result:
- $ curl -XGET 'http://localhost:9200/someprefix_2014-*/_stats'
- $ curl -XGET 'http://localhost:9200/2014/_stats'
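For reference, mapping the twelve indices to the alias could be done with a single request to the `_aliases` endpoint. The sketch below only builds the request body. The helper name `build_alias_actions` is made up for illustration, and the index names are assumed to run from someprefix_2014-01 to someprefix_2014-12 as in the scenario.

```python
import json


def build_alias_actions(alias, indices):
    """Build a body for POST /_aliases that adds `alias` to every index given.

    Hypothetical helper for illustration; it only assembles JSON and does not
    talk to a cluster.
    """
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indices]}


# Assumed index names for the scenario: one index per month of 2014.
monthly_indices = [f"someprefix_2014-{month:02d}" for month in range(1, 13)]

body = build_alias_actions("2014", monthly_indices)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent with something like `curl -XPOST 'http://localhost:9200/_aliases' -d @body.json`. Newer Elasticsearch versions would probably also need a `Content-Type: application/json` header.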

Scenario #2

You have a total of 24 monthly indices in your cluster, all sharing the someprefix_ prefix, and you decide you want to target all of them.
All of these requests would return the same result:
- $ curl -XGET 'http://localhost:9200/_stats'
- $ curl -XGET 'http://localhost:9200/_all/_stats'
- $ curl -XGET 'http://localhost:9200/*/_stats'
- $ curl -XGET 'http://localhost:9200/someprefix_*/_stats'

My Question

Are all of these methods doing the same thing "under the hood", or can one of them be expected to perform better than the others?

I ask because I've read about Wildcard Queries being a common performance bottleneck, but I've never seen a similar warning about using aliases or wildcards in index endpoints, or anything that distinguishes default aliases (like _all) from custom ones.

asked Apr 01 '15 21:04

Frankie Jarrett

1 Answer

They aren't exactly the same, from a code execution perspective. But they are functionally identical and will have identical performance profiles.

Aliases are really just "tags" that are attached to existing indices. So when you search against the 2014 alias, Elasticsearch just scans through the list of indices in the cluster state and finds all indices that are tagged with that alias.

When you search against a wildcard index pattern, it scans through the same list of indices in the cluster state to see which names match the pattern.
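As a rough illustration of the two lookups just described, here is a toy model. It is not Elasticsearch's actual code. The "cluster state" is just a dict, the function names are invented, Python's `fnmatchcase` stands in for whatever pattern matching Elasticsearch does internally, and the twelve 2013 indices are assumed so that the total comes to 24.

```python
from fnmatch import fnmatchcase

# Hypothetical, heavily simplified cluster state: index name -> set of aliases.
# The 2014 indices carry the "2014" alias; the assumed 2013 indices carry none.
cluster_state = {f"someprefix_2014-{m:02d}": {"2014"} for m in range(1, 13)}
cluster_state.update({f"someprefix_2013-{m:02d}": set() for m in range(1, 13)})


def resolve_alias(state, alias):
    """Return every index whose alias set ("tags") contains `alias`."""
    return sorted(name for name, aliases in state.items() if alias in aliases)


def resolve_wildcard(state, pattern):
    """Return every index whose name matches the glob-style `pattern`."""
    return sorted(name for name in state if fnmatchcase(name, pattern))


by_alias = resolve_alias(cluster_state, "2014")
by_pattern = resolve_wildcard(cluster_state, "someprefix_2014-*")

# Both lookups are one pass over the index list and yield the same names.
print(by_alias == by_pattern)
print(len(resolve_wildcard(cluster_state, "*")))
```

Either way the result is a plain list of concrete index names, and everything after that point (finding shards, running the request) no longer depends on how the list was produced.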

So performance will basically be the same, because the actual search is entirely unaffected: the shards associated with those searches will be queried no matter what, and all the index-to-shard lookups will happen on the coordinating node very quickly, no matter the method used.

So don't worry, you can choose whichever makes more sense for you :)

P.S. Wildcard queries are discouraged because they do have performance implications. They have to generate and check a large number of potential tokens, which can have a non-negligible impact on latency. But they are very different from index wildcards, or many other wildcards around ES. Most things that support pattern matching / wildcards in ES are simple string pattern matches done in Java over a short list of names, whereas the wildcard query is fancy automaton magic inside of Lucene against inverted indices... much different :)

answered Nov 05 '22 11:11

Zach

Tags: performance, rest, indexing, elasticsearch
