
ElasticSearch multi percolate performance

I have a dedicated index for percolators only.
There are 3000 queries there. Here's a typical query:

    {
    "index": "articles_percolators",
    "type": ".percolator",
    "body": {
        "query": {
            "filtered": {
                "query": {
                    "bool": {
                        "should": [
                            {
                                "query_string": {
                                    "fields": [
                                        "title"
                                    ],
                                    "query": "Cities|urban|urbanization",
                                    "allow_leading_wildcard": false
                                }
                            },
                            {
                                "query_string": {
                                    "fields": [
                                        "content"
                                    ],
                                    "query": "Cities|urban|urbanization",
                                    "allow_leading_wildcard": false
                                }
                            },
                            {
                                "query_string": {
                                    "fields": [
                                        "url"
                                    ],
                                    "query": "Cities|urban|urbanization",
                                    "allow_leading_wildcard": false
                                }
                            }
                        ]
                    }
                },
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "terms": {
                                    "feed_id": [
                                        3215,
                                        3216,
                                        10674,
                                        26041
                                    ]
                                }
                            }
                        ]
                    }
                }
            }
        },
        "sort": {
            "date": {
                "order": "desc"
            }
        },
        "fields": [
            "_id"
        ]
    },
    "id": "562"
}

Mapping (PHP array). Filters, analyzers and tokenizers are excluded for brevity:

    'index' => 'articles_percolators',
    'body' => [
        'settings' => [
            'number_of_shards' => 8,
            'number_of_replicas' => 0,
            'refresh_interval' => -1,
            'analysis' => [
                'filter' => [
                ],
                'analyzer' => [
                ],
                'tokenizer'=> [
                ]
            ]
        ],
        'mappings' => [
            'article' => [
                '_source' => ['enabled' => false],
                '_all' => ['enabled' => false],
                '_analyzer' => ['path' => 'lang_analyzer'],
                'properties' => [
                    'lang_analyzer' => [
                        'type' => 'string',
                        'doc_values' => true,
                        'store' => false,
                        'index' => 'no'
                    ],
                    'date' => [
                        'type' => 'date',
                        'doc_values' => true
                    ],
                    'feed_id' => [
                        'type' => 'integer'
                    ],
                    'feed_subscribers' => [
                        'type' => 'integer'
                    ],
                    'feed_canonical' => [
                        'type' => 'boolean'
                    ],
                    'title' => [
                        'type' => 'string',
                        'store' => false,
                    ],
                    'content' => [
                        'type' => 'string',
                        'store' => false,
                    ],
                    'url' => [
                        'type' => 'string',
                        'analyzer' => 'simple',
                        'store' => false
                    ]
                ]
            ]
        ]
    ]

I am then sending documents to the mpercolate API, 100 at a time. Here's a part (1 document) of the mpercolate request:

    {
        "percolate": {
            "index": "articles_percolators",
            "type": "article"
        }
    },
    {
        "doc": {
            "title": "Win a Bench Full of Test Equipment",
            "url": "\/document.asp",
            "content": "Keysight Technologies is giving away a bench full of general-purpose test equipment.",
            "date": 1421194639401,
            "feed_id": 12240778,
            "feed_subscribers": 52631,
            "feed_canonical": 1,
            "lang_analyzer": "en_analyzer"
        }
    }
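For reference, the full multi percolate request body is newline-delimited JSON: an action line naming the index and type, followed by the document itself, repeated per document. A minimal sketch (in Python, just assembling the body; the HTTP client and the `_mpercolate` endpoint are left out) of how a batch like mine is built:

```python
import json

def build_mpercolate_body(docs, index="articles_percolators", doc_type="article"):
    """Assemble an NDJSON body for the _mpercolate API: one 'percolate'
    action line per document, each followed by its 'doc' line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"percolate": {"index": index, "type": doc_type}}))
        lines.append(json.dumps({"doc": doc}))
    # bulk-style APIs expect a trailing newline
    return "\n".join(lines) + "\n"

docs = [{
    "title": "Win a Bench Full of Test Equipment",
    "url": "/document.asp",
    "content": "Keysight Technologies is giving away a bench full of test equipment.",
    "date": 1421194639401,
    "feed_id": 12240778,
    "feed_subscribers": 52631,
    "feed_canonical": 1,
    "lang_analyzer": "en_analyzer",
}]
body = build_mpercolate_body(docs)
# POST this body to /_mpercolate with any HTTP client
```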

100 articles are processed in ~1 second on my MacBook Pro (2.4 GHz Intel Core i7, 4 cores, 8 with HT), with all cores at maximum:

ES percolator utilizing all cores

This seems rather slow to me, but I don't have a baseline to compare against.
I have a regular index with the same mapping (but with 6 shards) holding over 3 billion documents, (still) living on a single server with a 24-core Xeon and 128 GB of RAM. Such queries search across the whole index in under 100 ms (on a hot server).

Is there something obviously wrong in my setup, and has anyone else benchmarked the performance of percolators? I couldn't find anything on the web about this...

My ES version is 1.4.2 with the default configuration, and the workload is completely CPU-bound.

Edit

Because John Petrone's comment about testing in a production environment is right, I have run the test on the same 24-core Xeon we use in production. The result with an 8-shard percolation index is the same, if not worse: times are between 1 s and 1.2 s, while the network latency there is lower than on my laptop.
This can probably be explained by the Xeon's slower clock speed per core - 2.0 GHz vs 2.4 GHz for the i7.

It results in almost constant CPU utilization of around 800%:

CPU utilization with 8-shard index

I then recreated the index with 24 shards, and times dropped to 0.8 s per 100 documents, but with more than double the CPU time:

CPU utilization with 24-shard index

I have a constant flow of around 100 documents per second, and the number of queries will rise in the future, so this is somewhat of a concern for me.

asked Mar 17 '23 by Jacket

1 Answer

So just to be clear: you can't compare normal Elasticsearch performance on a 24-core Xeon with 128 GB of memory against ES percolate performance on a laptop - very different hardware and very different software paths.

With many large index setups (like yours with 3 billion docs) you tend to be either disk- or memory-bound when running queries. As long as you have enough of both, query performance can be quite high.

Percolation is different - you are in effect indexing each document and then running every query registered in the percolator against it, all in an in-memory Lucene index:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-percolate.html

Percolation scales horizontally and tends to be CPU-bound - you scale it by adding additional nodes with sufficient CPU.

With 100 documents submitted via the multi percolate API against 3,000 registered percolate queries, you are basically running 300,000 individual queries. I would expect that to be CPU-bound on a MacBook - I think you'd be better off benchmarking this in a more controlled environment (a separate server), one that you can scale by adding additional nodes.

UPDATE

So to get a better idea of what the bottleneck is and how to improve your performance, you're going to need to start with lower numbers of registered queries and fewer documents at a time, and then ratchet up. This will give you a much clearer picture of what's going on behind the scenes.

I'd start with a single document (not 100) and far fewer registered queries, then run a series of tests - some raising the number of documents, some raising the number of registered queries, in multiple steps - and then go above 100 documents at a time and above 3,000 queries.
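The ratchet-up above amounts to sweeping a two-dimensional test grid. A sketch (the sweep values are arbitrary examples; only 3,000 queries and batches of 100 come from the question):

```python
import itertools

# Hypothetical sweep values bracketing the question's figures
# (3000 registered queries, batches of 100 documents).
query_counts = [10, 100, 1000, 3000, 10000]
batch_sizes = [1, 10, 100, 500]

# Each cell is one timed run; recording seconds / (n_queries * batch_size)
# per cell shows whether the cost per query execution stays flat (linear
# scaling) or grows past some inflection point.
grid = list(itertools.product(query_counts, batch_sizes))
```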

By looking at the results you will get a better idea of how performance declines with load - is it linear in the number of documents? Linear in the number of registered queries?

Other configuration variants I would try: instead of 100 docs via the multi percolate API, try the single-document percolate API from multiple threads (to see if the multi-doc API itself is the issue). I'd also try running multiple nodes on the same system, or many smaller servers, to see if you get better performance across several smaller nodes. And I'd vary the amount of memory allocated to the JVM (more is not necessarily better).
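The single-document-in-multiple-threads variant could be sketched like this. Note that `percolate_one` is a placeholder, not a real client call; in an actual test it would POST one document to the single-doc `_percolate` endpoint:

```python
from concurrent.futures import ThreadPoolExecutor

def percolate_one(doc):
    # Placeholder: a real implementation would POST the document to
    # /articles_percolators/article/_percolate and return the response.
    return {"matches": []}

def percolate_batch_threaded(docs, workers=8):
    """Issue one single-document percolate call per document, in parallel,
    instead of one _mpercolate call for the whole batch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(percolate_one, docs))

results = percolate_batch_threaded([{"title": f"doc {i}"} for i in range(100)])
```

Comparing wall-clock time of this against the 100-document `_mpercolate` call would show whether the multi-doc API adds overhead of its own.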

Ultimately you want a range of data points to try to identify how your queries scale and where the inflection points are.

answered Mar 25 '23 by John Petrone