Using Lucene like a relational database

I am wondering whether some RDBMS capabilities can be achieved in Lucene.

Example:

  1. I have 10,000 project documents (PDF files) whose content has to be indexed so that they can be searched.
  2. Every document belongs to a SINGLE PROJECT. A project has details such as project name, number, start date, end date, location, type, etc.

I have to search in the contents of the pdf files for a given keyword, but while displaying the results I want to display the project meta data as mentioned in point (2).

My idea is to store a field called projectId with each PDF file at indexing time. Once a search returns that projectId, we run a second search to fetch the project meta data.

This way we avoid duplicating data. Also, if we want to update the project meta data, we only have to update it in a SINGLE PLACE. Otherwise, if we store the meta data with every PDF document in the index, we end up updating all of those documents, which is not what I am looking for.
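
A minimal sketch of this two-step lookup, using plain Python data structures as stand-ins for the two indexes. No Lucene API is used here, and every name and sample value is hypothetical; it only illustrates that the PDF entries carry a projectId and the meta data lives in one place.

```python
# Stand-in for the PDF content index: each entry carries only a projectId,
# not a copy of the project meta data. All sample data is made up.
pdf_index = [
    {"file": "report-a.pdf", "content": "foundation survey results", "projectId": "P1"},
    {"file": "report-b.pdf", "content": "survey of drainage options", "projectId": "P1"},
    {"file": "plan-c.pdf", "content": "electrical wiring plan", "projectId": "P2"},
]

# Stand-in for the project meta data index: one entry per project.
project_index = {
    "P1": {"name": "Harbour Bridge", "location": "Sydney"},
    "P2": {"name": "City Library", "location": "Leeds"},
}


def search(keyword):
    """Step 1: match the keyword in PDF content. Step 2: look up meta data by projectId."""
    results = []
    for doc in pdf_index:
        if keyword.lower() in doc["content"].lower().split():
            project = project_index[doc["projectId"]]
            results.append((doc["file"], project["name"], project["location"]))
    return results


before = search("survey")
# Updating the meta data touches one entry, yet every related hit reflects it.
project_index["P1"]["location"] = "Melbourne"
after = search("survey")
```

The trade-off is one extra lookup per distinct projectId at display time in exchange for a single place to update.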

Please advise.

KP. asked May 06 '09 09:05


1 Answer

If I understand you correctly, you have two questions:

  1. Can I store a project id in Lucene and use it for further searches? Yes, you can. This is a common practice.
  2. Can I use this project id to search Lucene for project meta data? Yes, you can, but whether it is a good idea depends on how often the meta data is updated and on your access pattern. If the meta data is relatively static and you only access it by id, Lucene may be a good place to store it. Otherwise, you can use the project id as a primary key into a database table, which could be a better fit.
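
As a rough illustration of the database option, the sketch below keeps the project meta data in a table keyed by project id and attaches it to search hits. It assumes an in-memory SQLite database via Python's standard sqlite3 module; the table name, column names, the `attach_metadata` helper, and the sample hits are all hypothetical, and the hits list merely stands in for whatever the Lucene content search returns.

```python
import sqlite3

# In-memory database for illustration; a real setup would use a persistent one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project (project_id TEXT PRIMARY KEY, name TEXT, location TEXT)"
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?)",
    [("P1", "Harbour Bridge", "Sydney"), ("P2", "City Library", "Leeds")],
)


def attach_metadata(hits):
    """Combine (file, project_id) search hits with the matching project row."""
    enriched_hits = []
    for file_name, project_id in hits:
        row = conn.execute(
            "SELECT name, location FROM project WHERE project_id = ?",
            (project_id,),
        ).fetchone()
        enriched_hits.append((file_name,) + row)
    return enriched_hits


# Hypothetical hits as they might come back from the content search.
hits = [("report-a.pdf", "P1"), ("plan-c.pdf", "P2")]
enriched = attach_metadata(hits)

# A meta data change is a single-row UPDATE; the index is left untouched.
conn.execute(
    "UPDATE project SET location = ? WHERE project_id = ?", ("Melbourne", "P1")
)
updated = attach_metadata(hits)
```

With this split, frequent meta data updates never require re-indexing the PDF documents.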
Yuval F answered Oct 05 '22 14:10