
Questions on Upgrading Lucene from 2.2 to 2.9 to 3.1

Tags: java, lucene

I have existing software that uses Lucene 2.2.x, and I need to upgrade it to 3.1. The documentation I've read suggests upgrading to 2.9.x first, removing all deprecation warnings, and then upgrading to 3.1.x. I also have indexes already deployed, and the code needs to stay compatible with them.

My main question is about handling dates. In 2.2.x I had to use DateTools.dateToString() to convert dates into strings that I could index and store. I created two fields on every document: one for searching, stored with hour resolution, and another that was not analyzed. Lucene 2.9.x now supports data types other than strings. Can these new types be used in range queries against an index built by a prior version that used DateTools to convert dates to strings? Here is what I changed the code to:

Before:

return new RangeFilter("dateArchived-stored",
                DateTools.dateToString(start, DateTools.Resolution.MILLISECOND),
                DateTools.dateToString(end, DateTools.Resolution.MILLISECOND),
                false, true );

After:

return NumericRangeFilter.newLongRange("dateArchived-stored", 
                                       start.getTime(), 
                                       end.getTime(), true, true );

Now that Lucene supports non-string data types, do we still need to be concerned with the resolution of dates, as we did with term queries?

IndexWriter now requires a MaxFieldLength argument; prior versions didn't. Does UNLIMITED give the same behavior as prior versions? Is UNLIMITED the safest choice, given that I'll be reading indexes that were created with 2.2?

Before:

new IndexWriter( indexDirectory, analyzer )

After:

new IndexWriter( FSDirectory.open(indexDirectory), analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED )

Sort now requires a SortField, which in turn requires a type for the field. For fields that were indexed as strings with 2.2.x, can we declare a different type, or should they always be SortField.STRING?

Before:

new Sort("timestamp", false )

After:

new Sort(new SortField("timestamp", SortField.LONG, false) )

Will this work with indexes built in 2.2.x, but read by 2.9.x?
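One way to reason about the sort question without touching an index is to compare orderings in plain Java. The sketch below is only an illustration under assumptions: it assumes the "timestamp" field holds digit-only terms, and the class and method names are hypothetical. It does not call Lucene. It shows that fixed-width terms (such as millisecond-resolution DateTools output) order the same way as strings and as longs, while variable-width digit strings do not.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of Lucene.
class SortOrderSketch {

    // Lexicographic order, as a string comparison of terms would give.
    static String[] sortAsStrings(String[] terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Numeric order, obtained by parsing each term as a long.
    static String[] sortAsLongs(String[] terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy, (a, b) -> Long.compare(Long.parseLong(a), Long.parseLong(b)));
        return copy;
    }

    public static void main(String[] args) {
        // Fixed-width terms in the yyyyMMddHHmmssSSS layout (assumed field content).
        String[] fixedWidth = {"20110912040000000", "20101231235959999", "20110101000000000"};
        // Variable-width digit strings, e.g. raw epoch values written without padding.
        String[] variableWidth = {"999", "1315800000000", "86400000"};

        System.out.println(Arrays.equals(sortAsStrings(fixedWidth), sortAsLongs(fixedWidth)));
        System.out.println(Arrays.equals(sortAsStrings(variableWidth), sortAsLongs(variableWidth)));
    }
}
```

Whether a given Lucene version can parse the existing terms as longs is a separate matter that is best checked against a copy of a real 2.2.x index.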

Finally, will I have any issues going straight to 3.1.x with indexes built in 2.2.x? I'm doing the transition through 2.9.x on my local dev system, but in the field it will go from 2.2.x straight to 3.1.x. Will I have to release a version that uses 2.9.x?

asked Sep 12 '11 04:09 by chubbsondubs


1 Answer

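"Is it safest to use UNLIMITED?" Yes. That option has nothing to do with documents that have already been indexed; it only limits how many terms per field are indexed for documents added through that writer.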

If the existing fields were indexed as strings, you can't use a numeric range on them. You can verify this yourself.
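To see why the string terms and the long values don't line up, here is a minimal sketch in plain Java. It does not call Lucene: toTerm is a hypothetical helper that mimics what DateTools.dateToString produces at MILLISECOND resolution (a yyyyMMddHHmmssSSS string in GMT), and inRange mimics the bounds of the original RangeFilter call (exclusive lower, inclusive upper). The sample dates are made up for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper class, not part of Lucene.
class DateTermSketch {

    // Mimics DateTools.dateToString(date, DateTools.Resolution.MILLISECOND).
    static String toTerm(Date date) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmssSSS");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(date);
    }

    // String comparison with an exclusive lower bound and an inclusive upper bound.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) > 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        Date start = new Date(1315800000000L);    // made-up sample values
        Date archived = new Date(1315800540000L);
        Date end = new Date(1315803600000L);

        // The indexed term is a formatted date string, not the epoch value.
        System.out.println(toTerm(archived));
        System.out.println(Long.toString(archived.getTime()));

        // A string range over the formatted terms still finds the document.
        System.out.println(inRange(toTerm(archived), toTerm(start), toTerm(end)));
    }
}
```

The two printed values differ, which is the core of the problem: the terms in the old index are formatted date strings, while a numeric range built from getTime() works on epoch milliseconds. Existing documents would need to be reindexed with a numeric field before a numeric range filter could match them.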

answered Oct 22 '22 14:10 by mihaicc