We are planning to introduce Elasticsearch (on AWS) for our multi-tenant application. We have the options below.
According to this blog post, https://www.elastic.co/blog/found-multi-tenancy, the first option would cause memory issues, but I am not clear about the other options.
It seems that with the third option there is no data segregation, and I am not sure about security.
I believe the second option would be better, as the data would be segregated.
Please help me identify the best option for using Elasticsearch with multi-tenancy.
Please note that we will be running on AWS infrastructure.
Elasticsearch is built on top of the Apache Lucene search library and provides a robust, scalable platform for running search and analytics applications. Multi-tenancy in Elasticsearch refers to the ability to support multiple tenants, or users, on a single instance of Elasticsearch.
Multitenancy is a software architecture where a single software instance can serve multiple, distinct user groups. Software-as-a-service (SaaS) offerings are an example of multitenant architecture.
From a purely infrastructure-focused view, multi-tenancy describes how resources are shared by tenants to promote agility and cost efficiency. For example, a microservice or an Amazon Elastic Compute Cloud (Amazon EC2) instance might be consumed by multiple tenants of your SaaS system.
Multitenancy is a reference to the mode of operation of software where multiple independent instances of one or multiple applications operate in a shared environment. The instances (tenants) are logically isolated, but physically integrated.
We are considering the same question right now, and the following set of articles from the Elasticsearch guide was very helpful.
Start here: https://www.elastic.co/guide/en/elasticsearch/guide/current/scale.html
And read through each subsequent article until you hit this one: https://www.elastic.co/guide/en/elasticsearch/guide/current/finite-scale.html
The following two were very eye-opening for me:
https://www.elastic.co/guide/en/elasticsearch/guide/current/faking-it.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/one-big-user.html
The basic takeaway:
This link is too important not to mention here: http://www.bigeng.io/elasticsearch-scaling-multitenant/
It covers good architecture dilemmas, with great performance analysis and reasoning.
TL;DR: they used index groups built around shard allocation filtering to segregate load across the nodes in the cluster.
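As a rough illustration of what shard allocation filtering looks like at the settings level, here is a minimal sketch that only builds the request body for `PUT /<index>/_settings`. It assumes each node has been tagged with a custom attribute (for example `node.attr.group: group_a` in `elasticsearch.yml`); the attribute name `group`, the group names, and the helper function are hypothetical, not taken from the linked article. Whether a managed AWS domain lets you set custom node attributes is something to verify before relying on this.

```python
import json


def allocation_settings(group, attribute="group"):
    """Build an index settings body that pins an index's shards to nodes
    whose custom attribute `attribute` equals `group`.

    `attribute` and `group` are illustrative names (assumptions).
    """
    # "require" means a node must carry this attribute value to hold shards
    # of the index; "include" and "exclude" variants also exist.
    return {f"index.routing.allocation.require.{attribute}": group}


# Body to send as: PUT /tenants_group_a/_settings  (index name is hypothetical)
settings_body = allocation_settings("group_a")
print(json.dumps(settings_body, indent=2))
```

Indices that belong to the same group would all receive the same setting, so their load stays on the nodes tagged for that group.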
To sum up the accepted answer and the other articles:
1) Use a shared index with custom routing, accessed through an alias.
1.1) Special case: a big client can have a dedicated index, but only if needed.
The following article covers many use cases in detail: https://www.elastic.co/blog/found-multi-tenancy
The following page gives the conclusion on how to do it (link source: accepted answer): https://www.elastic.co/guide/en/elasticsearch/guide/current/faking-it.html
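The shared-index approach summarized above can be sketched as request bodies for the `_aliases` API. This is a minimal sketch under stated assumptions, not a definitive setup: the index names, the `tenant_<id>` alias naming scheme, the `tenant_id` field, and both helper functions are hypothetical. It only builds the JSON and does not contact a cluster.

```python
import json


def tenant_alias_actions(shared_index, tenant_id, tenant_field="tenant_id"):
    """Body for POST /_aliases: one filtered, routed alias per tenant.

    The filter limits searches through the alias to the tenant's documents,
    and the routing value sends that tenant's documents to a single shard.
    """
    return {
        "actions": [
            {
                "add": {
                    "index": shared_index,
                    "alias": f"tenant_{tenant_id}",  # assumed naming scheme
                    "filter": {"term": {tenant_field: tenant_id}},
                    "routing": tenant_id,
                }
            }
        ]
    }


def move_to_dedicated_index(shared_index, dedicated_index, tenant_id):
    """Body for POST /_aliases: repoint a big tenant's alias to its own index.

    Both actions sit in one request, so the alias swap is applied together.
    Copying the tenant's documents into the dedicated index is a separate
    step that is not shown here.
    """
    alias = f"tenant_{tenant_id}"
    return {
        "actions": [
            {"remove": {"index": shared_index, "alias": alias}},
            {"add": {"index": dedicated_index, "alias": alias}},
        ]
    }


alias_body = tenant_alias_actions("tenants_shared", "acme")
move_body = move_to_dedicated_index("tenants_shared", "tenant_acme_v1", "acme")
print(json.dumps(alias_body, indent=2))
print(json.dumps(move_body, indent=2))
```

Because the application always talks to the alias, moving a large tenant to a dedicated index later does not require changing application code. Note that an alias filter scopes queries but is not an access-control mechanism by itself, so the security concern raised in the question still needs to be handled separately.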