
Multi-tenancy in Elasticsearch

We are planning to introduce Elasticsearch (on AWS) for our multi-tenant application. We have the following options:

  1. Using One Index Per Tenant
  2. Using One Type Per Tenant
  3. All Tenants Share One Index with Custom routing

According to this blog post, https://www.elastic.co/blog/found-multi-tenancy, the first option would cause memory issues, but I am not clear about the other options.

With the third option there seems to be no data segregation, and I am not sure about the security implications.

I believe the second option would be better, since the data would be segregated.

Please help me identify the best option for multi-tenancy with Elasticsearch.

Please note that we will be running on AWS infrastructure.

Selvakumar Ponnusamy asked Jan 26 '17 06:01




3 Answers

We are considering the same question right now, and the following set of articles by Elasticsearch was very helpful.

Start here: https://www.elastic.co/guide/en/elasticsearch/guide/current/scale.html

And read through each subsequent article until you hit this one: https://www.elastic.co/guide/en/elasticsearch/guide/current/finite-scale.html

The following two were very eye-opening for me:

https://www.elastic.co/guide/en/elasticsearch/guide/current/faking-it.html

https://www.elastic.co/guide/en/elasticsearch/guide/current/one-big-user.html

The basic takeaway:

  • Alias per customer
  • Shard routing
  • Now you can have dedicated indices for big customers and shared indices for small customers, and to the application each customer appears to have its own index
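
To make the alias-per-customer idea concrete, here is a minimal sketch in Python that only builds the JSON body for the `_aliases` API (it does not talk to a cluster). The index name `shared_tenants`, the field name `tenant_id`, and the `tenant_<id>` alias naming are assumptions for illustration, not something taken from the linked articles. Each alias carries a filter (so searches only see that tenant's documents) and a routing value (so the tenant's documents land on one shard).

```python
import json


def tenant_alias_action(tenant_id: str, index: str = "shared_tenants") -> dict:
    """Build one 'add' action for the _aliases API.

    The index name, the 'tenant_id' field, and the alias naming scheme
    are hypothetical; adapt them to your own mapping.
    """
    return {
        "add": {
            "index": index,
            "alias": f"tenant_{tenant_id}",
            # Searches through the alias only match this tenant's documents.
            "filter": {"term": {"tenant_id": tenant_id}},
            # Index and search requests through the alias use this routing value.
            "routing": tenant_id,
        }
    }


def aliases_request_body(tenant_ids, index: str = "shared_tenants") -> dict:
    """Combine the per-tenant actions into one body for POST /_aliases."""
    return {"actions": [tenant_alias_action(t, index) for t in tenant_ids]}


if __name__ == "__main__":
    print(json.dumps(aliases_request_body(["acme", "globex"]), indent=2))
```

The application would then index and search against `tenant_acme` as if it were a separate index. Note that the filter is a convenience, not an access control mechanism by itself, so clients should not be able to address the shared index directly.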
jzheaux answered Oct 19 '22 00:10



This link is too important not to mention here: http://www.bigeng.io/elasticsearch-scaling-multitenant/

Good architecture dilemmas, and great performance analysis / reasoning.

TL;DR: they built index groups around shard allocation filtering to segregate load across the nodes in the cluster.
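
As a rough illustration of shard allocation filtering (not the exact setup from the linked article), the Python sketch below builds the settings body you would send with `PUT /<index>/_settings` to pin an index to a group of nodes. It assumes the nodes were started with a custom attribute such as `node.attr.index_group: group_a`; the attribute name `index_group` and the group names are hypothetical, and on a managed service you would need to check whether custom node attributes are available at all.

```python
def index_group_settings(node_group: str, attribute: str = "index_group") -> dict:
    """Settings body that restricts an index's shards to one node group.

    Uses the index-level allocation filter
    'index.routing.allocation.include.<attribute>'. The attribute name is
    an assumption and must match a 'node.attr.<attribute>' value set on
    the nodes.
    """
    return {f"index.routing.allocation.include.{attribute}": node_group}


if __name__ == "__main__":
    # Hypothetical layout: two indices assigned to different node groups.
    layout = {"tenants_small": "group_a", "tenants_big": "group_b"}
    for index_name, group in layout.items():
        print(index_name, index_group_settings(group))
```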

Froyke answered Oct 18 '22 23:10



To sum up the accepted answer and the other articles:

  1. Use a shared index, with custom routing applied through a per-tenant alias.

    1.1) Special case: a big client can have a dedicated index, but only if needed.

The following article covers many use cases in detail: https://www.elastic.co/blog/found-multi-tenancy

The following page shows how to do it (link taken from the accepted answer): https://www.elastic.co/guide/en/elasticsearch/guide/current/faking-it.html
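
For the special case in 1.1, a tenant that outgrows the shared index can be moved by repointing its alias. The sketch below, in Python, builds a single `_aliases` request that removes the alias from the shared index and adds it to the dedicated one; actions in one `_aliases` call are applied atomically, so the application never sees the alias missing. The index names and the `tenant_<id>` alias convention are hypothetical, and the sketch assumes the tenant's documents have already been copied into the dedicated index.

```python
def move_tenant_actions(tenant_id: str, shared_index: str, dedicated_index: str) -> dict:
    """Body for POST /_aliases that swaps a tenant's alias to a dedicated index.

    Assumes the tenant's data was already reindexed into 'dedicated_index'.
    No filter or routing is set on the new alias because the dedicated
    index holds only this tenant's documents.
    """
    alias = f"tenant_{tenant_id}"
    return {
        "actions": [
            {"remove": {"index": shared_index, "alias": alias}},
            {"add": {"index": dedicated_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    print(move_tenant_actions("acme", "shared_tenants", "tenant_acme_v1"))
```

Because the application only ever talks to the alias, no client code changes when a tenant moves between a shared and a dedicated index.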

Anonymous Creator answered Oct 19 '22 00:10
