
Elasticsearch, Nest and Lucene.net

I know that Elasticsearch is built on Lucene, but I wonder whether Elasticsearch gives me any benefit when developing a search engine, compared with coding against Lucene.Net directly. Sorry if the question is a bit basic, but I am confused after researching the options for building a search engine.

I found plenty of examples of simple Lucene.Net search but not many for Elasticsearch and NEST. Another question: what is the difference between NEST and Elasticsearch? Are they the same thing?

If someone could shed some light here, ideally with a nice sample, I would appreciate it. What I need is an easy-to-use, fast search engine. What would be the best option? Other alternatives are welcome too, but only .NET (C# or VB). Thanks.

Emil asked Apr 23 '15 09:04


2 Answers

Lucene

Lucene, along with its .NET port Lucene.Net, is a search engine library that adds full-text search to an application. It builds an inverted index from the Documents (and the fields within each Document) that you feed it. One example is search in the NuGet Gallery source, where a NuGet package and its properties are converted into a document that is passed to Lucene. The inverted index is stored across files within a directory.
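To make "inverted index" concrete, here is a minimal, library-neutral sketch in Python. It is not the Lucene.Net API and leaves out everything a real engine adds (analyzers, scoring, on-disk segments). The class name, the tokenizer and the two sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy index: maps each term to the set of document ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> {doc_id, ...}
        self.docs = {}                    # doc_id -> original fields

    def add(self, doc_id, fields):
        """Index every field value of a document under its terms."""
        self.docs[doc_id] = fields
        for value in fields.values():
            for term in tokenize(value):
                self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (AND semantics)."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result


# Hypothetical documents, loosely modelled on package metadata.
index = InvertedIndex()
index.add(1, {"id": "Lucene.Net", "description": "Full-text search library for .NET"})
index.add(2, {"id": "NEST", "description": "Elasticsearch client for .NET"})
```

A lookup such as `index.search("search library")` only touches the posting sets for "search" and "library" rather than scanning every document, which is the reason the structure is fast.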

Elasticsearch

Elasticsearch is a distributed search engine that uses Lucene under the covers. An Elasticsearch cluster is made up of one or more nodes, and each node can hold a number of primary shards and replica shards; each shard is a complete Lucene index. This infrastructure gives fast performance and allows horizontal scaling to search a large amount of data, since you are no longer limited by the constraints of a single Lucene index on a single machine. In addition, you can achieve high availability with fault tolerance and disaster recovery, since each shard can be replicated onto other nodes, so that no single node is a point of failure. An example of Elasticsearch with NEST is up on my blog.
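The shard and replica idea can be sketched in a few lines of Python. This is a simplified model, not Elasticsearch's real allocator: the node names, the shard count and the round-robin placement are assumptions, and `zlib.crc32` stands in for the hash function Elasticsearch actually applies to the routing value.

```python
import zlib

NUM_PRIMARY_SHARDS = 3                   # assumption: fixed when the index is created
NODES = ["node-a", "node-b", "node-c"]   # hypothetical node names


def shard_for(doc_id):
    """Pick a shard from the document id: hash(routing) % number of primary shards."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_PRIMARY_SHARDS


def allocation(nodes, num_shards):
    """Place each shard's primary and its one replica on two different nodes."""
    table = {}
    for shard in range(num_shards):
        primary = nodes[shard % len(nodes)]
        replica = nodes[(shard + 1) % len(nodes)]
        table[shard] = (primary, replica)
    return table


def surviving_copies(table, failed_node):
    """List the nodes that still hold a copy of each shard after one node fails."""
    return {
        shard: [node for node in copies if node != failed_node]
        for shard, copies in table.items()
    }


table = allocation(NODES, NUM_PRIMARY_SHARDS)
after_failure = surviving_copies(table, "node-b")
```

Because a primary and its replica never share a node in this model, `after_failure` still lists at least one node for every shard, which is the "no single point of failure" property described above.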

Which to use?

Well, it depends on your use case (it nearly always does, right?). If your application is one that gets installed onto a machine and all data is persisted locally, you might decide to use the Lucene library within the application and persist the index directory to local disk. Similarly, if you have a simple web application that runs on a single server with a small number of users, then using Lucene may also be a sensible choice. On the other hand, if your application runs across multiple machines in a web farm and requires search capabilities, going with a distributed search engine like Elasticsearch would be a good idea.

How well does Elasticsearch scale? Back in 2013, GitHub was using Elasticsearch to index 2 billion documents (all the code files in every repository on the site) across 44 separate Amazon EC2 instances, each with two terabytes of ephemeral SSD storage, giving a total of 30 terabytes of primary data. Stack Overflow also uses Elasticsearch to power search on this site (perhaps a dev could comment with some figures/metrics?).

Russ Cam answered Oct 22 '22 13:10


Lucene and Elasticsearch are two entirely different classes of applications.

Lucene is a library implementing an inverted index and search and ranking on it with a basic Lucene query language. It's not a standalone application that you can just run and use (to index docs, search them, retrieve them, ...).

Elasticsearch is a distributed server built on top of Lucene. It gives you a nice REST API which you can use to index, search, and retrieve your documents. It also implements a query language with features well beyond what Lucene can do on its own. Being distributed means that you can run Elasticsearch as a cluster on a number of machines and it will automatically take care of distributing and replicating data between them.
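As a rough illustration of what "index and search over REST" looks like, the Python sketch below only builds the HTTP requests and does not send them. The base URL, the index name `movies` and the sample document are assumptions. The `_doc` path is the form used by recent Elasticsearch versions; versions current when this question was asked put a type name in that position instead.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port
INDEX = "movies"                    # hypothetical index name


def index_request(doc_id, document):
    """Build a PUT request that stores `document` under the given id."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        f"{BASE_URL}/{INDEX}/_doc/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(field, text):
    """Build a POST request that runs a full-text `match` query on one field."""
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return Request(
        f"{BASE_URL}/{INDEX}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


put_req = index_request(1, {"title": "The Matrix", "year": 1999})
find_req = search_request("title", "matrix")
```

With a server actually running, passing either request to `urllib.request.urlopen` would send it. From .NET, a client library such as NEST issues the equivalent HTTP calls for you.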

Similarly, Solr is also a search engine built on top of Lucene.

So it really depends on what exactly you wish to achieve. If you just want a full-text search feature embedded in an existing application, then Lucene might be all you need. On the other hand, if you want to build, say, a movie search engine for your website about movies, then you'd be much better off using either Elasticsearch or Solr.

Jakub Kotowski answered Oct 22 '22 15:10