How is data stored in Lucene?

Tags:

lucene

I know that Lucene creates an index and stores all the data in it. Can anyone tell me how the data is stored in the flat files, or what kind of algorithms it uses to store the data in the backend so that it can be retrieved quickly?

288

asked Feb 01 '12 07:02

Ramesh

2 Answers

I don't know if this is exactly what you asked for, but the general answer is that Lucene implements an inverted index. The specifics of how Lucene stores it on disk are in the file formats documentation (as milan said).

The general idea is that Lucene stores an inverted index data structure plus other auxiliary data structures that help answer queries quickly. For example, it stores a vector of norms (one per document, per field) and each term's document frequency, from which the IDF (inverse document frequency) is computed. Lucene also stores the actual document fields, but those live outside the inverted index.
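To make the idea concrete, here is a minimal in-memory sketch of an inverted index with the auxiliary structures mentioned above (norms, document frequencies, and stored fields kept separately). It is only an illustration: the class name `TinyInvertedIndex`, the whitespace tokenizer, the norm formula, and the TF-IDF style score are all assumptions made for this example, not Lucene's actual API, scoring, or on-disk format.

```python
import math
from collections import defaultdict


class TinyInvertedIndex:
    """Toy inverted index (hypothetical; not Lucene's real data layout)."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.norms = {}                    # doc_id -> length normalization factor
        self.stored = {}                   # doc_id -> original text, kept outside the index

    def add(self, doc_id, text):
        tokens = text.lower().split()      # naive whitespace tokenizer (assumption)
        for tok in tokens:
            self.postings[tok][doc_id] = self.postings[tok].get(doc_id, 0) + 1
        # Shorter documents get a larger norm (assumed formula: 1 / sqrt(length)).
        self.norms[doc_id] = 1.0 / math.sqrt(len(tokens)) if tokens else 0.0
        self.stored[doc_id] = text

    def doc_freq(self, term):
        """Number of documents that contain the term."""
        return len(self.postings.get(term.lower(), {}))

    def idf(self, term):
        # IDF is derived from the document frequency at query time.
        return 1.0 + math.log(len(self.stored) / (self.doc_freq(term) + 1))

    def search(self, term):
        """Return matching doc ids, best score first (ties broken by doc id)."""
        hits = self.postings.get(term.lower(), {})
        if not hits:
            return []
        idf = self.idf(term)
        scored = [(tf * idf * self.norms[doc_id], doc_id) for doc_id, tf in hits.items()]
        return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


index = TinyInvertedIndex()
index.add(1, "Lucene stores an inverted index")
index.add(2, "the index maps terms to documents")
for doc_id in index.search("index"):
    print(doc_id, index.stored[doc_id])
```

The lookup is fast because a query term goes straight to its postings list instead of scanning every document; the real file formats add compression, skip data, and segment files on top of this basic idea.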

168

answered Oct 18 '22 22:10

Felipe Hummel

You can find all of that explained in the file formats section of the Lucene documentation.

answered Oct 18 '22 21:10

milan
