I am very new to Elasticsearch and to scaling it, and I have a question I don't even know how to approach.
Here's the situation:
There are several servers running Rails microservice applications. Each of them collects its own fairly large dataset (more specifically, each aggregates posts from a different social network, so the indexable search fields are the same in all databases).
I need a solution that keeps the data where it currently is and adds an Elasticsearch server dedicated exclusively to searching across the multiple databases, without the respective Rails apps being hosted on that search server. That potentially means setting up ES on each of the other servers and defining the search patterns there, but running the multi-model search on a totally different server.
The final goal is to send either the entire ActiveRecord objects or all of their related attributes to the main application.
Is this even possible? Has anyone had a similar problem?
I am a little lost about how to get started with it.
This question is a little broad, but I think I can at least point you in the right direction. First, let me restate your problem as I understand it.
You have multiple databases, each populated by its own microservice. Each database contains similar information that you want to be able to search over (e.g. author, body, title). You want an Elasticsearch cluster that has access to the data in all of those databases and can return results that identify the database and document matching a search.
Elasticsearch is very powerful when it comes to handling complicated cases like this. Since all of your data has a similar structure and the same fields, you can use a single index with two additional fields on each document: one storing which DB the document comes from, and one storing the document's ID in that DB. This will allow you to perform searches such as 'Give me every post made by William Shatner across these 3 social networks.'
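To make the single-index idea concrete, here is a minimal sketch in plain Ruby of what one indexed document and the "William Shatner" query could look like. The field names `source` and `source_id`, the record shape, and the network names are my own assumptions, not anything your apps already define. The `terms` filter also assumes `source` is mapped as a `keyword` field.

```ruby
require "json"

# Build the document stored in the shared index for one database row.
# `source` names the originating database and `source_id` is the row's
# primary key there (both field names are assumptions for this sketch).
def build_search_document(source:, record:)
  {
    source: source,
    source_id: record.fetch(:id),
    author: record.fetch(:author),
    title: record.fetch(:title),
    body: record.fetch(:body)
  }
end

# A deterministic document ID, so re-indexing the same row overwrites the
# existing document instead of creating a duplicate.
def document_id(source:, record:)
  "#{source}-#{record.fetch(:id)}"
end

# Query DSL body: full-text match on the author, restricted to the given
# source databases. Assumes `source` is mapped as a keyword field.
def author_query(author:, sources:)
  {
    query: {
      bool: {
        must: [{ match: { author: author } }],
        filter: [{ terms: { source: sources } }]
      }
    }
  }
end

# Print the request body you would send to the index's _search endpoint.
puts JSON.pretty_generate(
  author_query(author: "William Shatner", sources: %w[twitter facebook instagram])
)
```

The combined `source` plus `source_id` pair is what lets a search hit point back to exactly one row in exactly one of your databases.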
You will need two additional pieces of functionality to make this work. First, you need a mechanism for getting the data from each database into the search index. On my team, we use a separate IndexingService that knows how to read event streams and send the live data to the ES index. You just need to decide on an indexing strategy (e.g. how often you update the index with new entries). Second, you will need some logic on the client side to take the raw search result and retrieve the relevant entries from the originating database.
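Both pieces can be sketched in plain Ruby without any Elasticsearch client. This is only an illustration under assumptions: the index name `social_posts`, the event shape, the `source`/`source_id` fields, and the `fetchers` lookup table are all hypothetical. The parts that are standard Elasticsearch are the newline-delimited format of the Bulk API and the `hits.hits[]._source` shape of a search response.

```ruby
require "json"

# Hypothetical index name; use whatever your shared index is called.
INDEX = "social_posts"

# Indexing side: turn a batch of change events into a Bulk API payload
# (newline-delimited JSON: one action line, then one document line).
# The event shape (:source, :id, :author, :title, :body) is an assumption.
def bulk_payload(events)
  lines = events.flat_map do |event|
    doc_id = "#{event.fetch(:source)}-#{event.fetch(:id)}"
    action = { index: { _index: INDEX, _id: doc_id } }
    document = {
      source: event.fetch(:source),
      source_id: event.fetch(:id),
      author: event.fetch(:author),
      title: event.fetch(:title),
      body: event.fetch(:body)
    }
    [JSON.generate(action), JSON.generate(document)]
  end
  lines.join("\n") + "\n" # the Bulk API requires a trailing newline
end

# Client side: resolve search hits back to full records.
# `fetchers` maps a source name to a callable that takes a list of IDs and
# returns the matching records (hypothetical: an HTTP call to that
# microservice, or Post.where(id: ids) inside it).
def resolve_hits(response, fetchers)
  hits = response.fetch("hits").fetch("hits").map { |hit| hit.fetch("_source") }

  # One batched lookup per source database instead of one per hit.
  records = {}
  hits.group_by { |src| src.fetch("source") }.each do |source, group|
    ids = group.map { |src| src.fetch("source_id") }
    fetchers.fetch(source).call(ids).each do |record|
      records[[source, record.fetch(:id)]] = record
    end
  end

  # Keep Elasticsearch's relevance order; skip rows deleted since indexing.
  hits.map { |src| records[[src.fetch("source"), src.fetch("source_id")]] }.compact
end
```

The string returned by `bulk_payload` would be POSTed to the cluster's `_bulk` endpoint, and `resolve_hits` would run in the main application after each search. Batching lookups per source keeps the number of calls to each microservice at one per search, however many hits come back.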
This is just one way to solve your problem. If you would rather maintain a separate index (or cluster) for each social network but still have a central place to search across them all, I suggest looking into Elasticsearch Tribe Nodes. A tribe node is a single place to submit a search: it knows about every search cluster and how to interact with each of them to return a unified search result. Note that tribe nodes were deprecated in Elasticsearch 5.4 and removed in 7.0; on newer versions, cross-cluster search serves the same purpose.
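Whichever way the per-network indices are joined together, the search request itself looks the same: Elasticsearch accepts a comma-separated list of index names (or a wildcard) in front of `_search`. A small sketch, assuming hypothetical index names `posts_twitter`, `posts_facebook` and `posts_instagram`, one per network:

```ruby
# Build the path for a search that spans several indices.
# Elasticsearch accepts a comma-separated list of index names before
# /_search; a wildcard pattern such as "posts_*" also works.
def multi_index_search_path(index_names)
  raise ArgumentError, "at least one index is required" if index_names.empty?

  "/#{index_names.join(',')}/_search"
end

# Hypothetical per-network index names.
NETWORK_INDICES = %w[posts_twitter posts_facebook posts_instagram].freeze

puts multi_index_search_path(NETWORK_INDICES)
# prints "/posts_twitter,posts_facebook,posts_instagram/_search"
```

With cross-cluster search, each name is additionally prefixed with the alias you configured for the remote cluster, in the form `cluster_alias:index_name`.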
The best way to learn Elasticsearch is to just get a cluster up and running and start experimenting! Good luck!