Indexing and Searching Date in Lucene

Tags:

java

lucene

I tried indexing a date with the DateTools.dateToString() method. It works properly for both indexing and searching.

However, my existing index, which other data already references, stores dates as the value of new Date().getTime().

So my problem is how to run a range query on that existing data.

Is there a solution to this?

Thanks in advance.

user660024 asked Mar 31 '11 05:03



2 Answers

You need to use a TermRangeQuery on your date field. Because TermRangeQuery compares terms as strings, the field should always be indexed with DateTools.dateToString(), which produces strings that sort in date order. Here's a full example of indexing and searching on a date range with Lucene 3.0:

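import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.DateTools;
import org.apache.lucene.document.DateTools.Resolution;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneDateRange {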
    public static void main(String[] args) throws Exception {
        // setup Lucene to use an in-memory index
        Directory directory = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
        MaxFieldLength mlf = MaxFieldLength.UNLIMITED;
        IndexWriter writer = new IndexWriter(directory, analyzer, true, mlf);

        // use the current time as the base of dates for this example
        long baseTime = System.currentTimeMillis();

        // index 10 documents with 1 second between dates
        for (int i = 0; i < 10; i++) {
            Document doc = new Document();
            String id = String.valueOf(i);
            String date = buildDate(baseTime + i * 1000);
            doc.add(new Field("id", id, Store.YES, Index.NOT_ANALYZED));
            doc.add(new Field("date", date, Store.YES, Index.NOT_ANALYZED));
            writer.addDocument(doc);
        }
        writer.close();

        // search for documents from 5 to 8 seconds after base, inclusive
        IndexSearcher searcher = new IndexSearcher(directory);
        String lowerDate = buildDate(baseTime + 5000);
        String upperDate = buildDate(baseTime + 8000);
        boolean includeLower = true;
        boolean includeUpper = true;
        TermRangeQuery query = new TermRangeQuery("date",
                lowerDate, upperDate, includeLower, includeUpper);

        // display search results
        TopDocs topDocs = searcher.search(query, 10);
        for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc);
            System.out.println(doc);
        }

        // release the searcher's underlying index reader
        searcher.close();
    }

    public static String buildDate(long time) {
        return DateTools.dateToString(new Date(time), Resolution.SECOND);
    }
}
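
For the data already indexed from new Date().getTime(), the same string-ordering idea decides whether a TermRangeQuery can work without reindexing. The sketch below uses no Lucene classes. inRange is a hypothetical helper that imitates a term range test with plain String.compareTo, sortableDate is a stand-in for DateTools.dateToString(date, Resolution.SECOND) built on SimpleDateFormat, and the whole thing assumes the old field holds the decimal string of the millisecond value as a single un-analyzed term.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class TermOrderSketch {
    // Hypothetical helper: an inclusive range test using lexicographic
    // string comparison, which is how term ranges are ordered.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    // Stand-in for DateTools.dateToString(date, Resolution.SECOND):
    // a fixed-width yyyyMMddHHmmss string in GMT.
    static String sortableDate(long millis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(new Date(millis));
    }

    public static void main(String[] args) {
        // Fixed-width date strings: string order matches date order.
        System.out.println(inRange(sortableDate(6000L),
                sortableDate(5000L), sortableDate(8000L)));

        // Raw millisecond strings with the same number of digits also
        // compare correctly (assumed storage format, see above).
        System.out.println(inRange("1301547780000",
                "1301547775000", "1301547785000"));

        // Mixed digit counts break: 999999999999 lies numerically between
        // the bounds, but "9..." sorts after "1..." as a string.
        System.out.println(inRange("999999999999",
                "900000000000", "1100000000000"));
    }
}
```

If that assumption about the stored format holds, passing Long.toString(...) of the lower and upper bounds to TermRangeQuery should behave correctly as long as every timestamp has 13 digits (roughly September 2001 through the year 2286). Values with a different digit count would sort incorrectly and would need to be reindexed in a fixed-width or numeric form.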
WhiteFang34 answered Nov 15 '22 20:11


You'll get much better search performance if you use a NumericField for your date and then a NumericRangeFilter or NumericRangeQuery for the range search.

You just have to encode your date as a long or an int. One simple way is to call the .getTime() method of your Date, but that may give you far more resolution (milliseconds) than you need. If you only need resolution down to the day, you can encode the date as an integer of the form YYYYMMDD.

Then, at search time, apply the same conversion to your start and end dates and run the NumericRangeQuery or NumericRangeFilter.
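
As a rough illustration of the YYYYMMDD encoding, here is a minimal sketch in plain Java with no Lucene classes. The class and method names (DayKey, toDayKey) are made up for this example, and the choice of UTC is an assumption; whatever time zone you pick has to be the same at index time and at search time.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class DayKey {
    // Encode a Date as a YYYYMMDD integer, e.g. 19700101.
    // UTC is an assumption; use one zone consistently everywhere.
    static int toDayKey(Date date) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.setTime(date);
        return cal.get(Calendar.YEAR) * 10000
                + (cal.get(Calendar.MONTH) + 1) * 100   // MONTH is zero-based
                + cal.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        long dayMillis = 24L * 60 * 60 * 1000;
        Date start = new Date(0L);              // 1970-01-01 UTC
        Date end = new Date(31 * dayMillis);    // 31 days later

        // The same conversion is applied to both range bounds.
        System.out.println(toDayKey(start));    // 19700101
        System.out.println(toDayKey(end));      // 19700201
    }
}
```

The resulting ints keep date order under numeric comparison, so the values produced for the start and end dates can serve as the lower and upper bounds of the numeric range search described above.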

Michael McCandless answered Nov 15 '22 18:11