Lucene - string field which doesn't need to be indexed

Tags: java, lucene

(Currently using Lucene 4.6).

I'm wondering why it seems to be discouraged to store text in an org.apache.lucene.document.Document without indexing it. TextField is indexed and tokenized. StringField is indexed but not tokenized.

But suppose you just need a String which accompanies the other info in your Documents but will itself never be the subject of a query?

It's just that (in 4.6) org.apache.lucene.document.Field.Index has a NO value, documented as "Do not index the field value.", but the whole enum is currently marked "Deprecated".

Why? Is there a better way of having "inert" String info accompanying your indexed (and possibly tokenized) more significant fields?

mike rodent asked May 29 '16 17:05



1 Answer

(Nearly) 2 years later and I hopefully have a slightly better understanding of things.

The answer to this appears to be to use StoredField.
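As a minimal sketch of that idea (the class name, the field names "body" and "note", and the sample strings are all made up for illustration), a Document can carry one searchable field plus one inert StoredField. It only builds the Document in memory, so it should not depend on a particular Lucene version, but treat it as an illustration, not a tested recipe:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;

public class InertFieldSketch {

    // Builds a document with one searchable field and one stored-only field.
    // "body" and "note" are hypothetical field names chosen for this sketch.
    static Document buildDocument(String body, String note) {
        Document doc = new Document();
        // Indexed and tokenized, and also stored so it can be read back.
        doc.add(new TextField("body", body, Field.Store.YES));
        // Stored only: not indexed, so it can never match a query on "note".
        doc.add(new StoredField("note", note));
        return doc;
    }

    public static void main(String[] args) {
        Document doc = buildDocument("some searchable text", "internal note");
        // Document.get returns the string value of the first field with that name.
        System.out.println(doc.get("note"));
    }
}
```

After a search, the same doc.get("note") call on the Document returned for a hit should give the note back, since stored values travel with the document even though they are not searchable.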

In fact, TextField and StringField values that were added with Field.Store.YES come back from the index as StoredField instances. StoredField is "agnostic" about the value type, as are TextField and StringField: all of these are subclasses of Field, which has, among other methods, setStringValue, setIntValue, stringValue and numericValue!

With Lucene 6 there is no IntField ... but there is a "red herring" called IntPoint. It's a red herring because it is only indexed (for exact and range queries): its value is never stored, so you cannot read it back from a search hit... ever!

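In fact if you need to get an Integer back out of the index you (I think) need to use a StoredField, adding an IntPoint (for numeric queries) or a StringField (for exact matches on its text form) alongside it if the value also needs to be indexed.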
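One way this can look for an integer in Lucene 6 or later is to add two fields under the same name: an IntPoint to make the value searchable and a StoredField to make it retrievable. The sketch below is an assumption-laden illustration, not the only approach; the class name, the helper storedYear and the field name "year" are hypothetical:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.IndexableField;

public class StoredIntSketch {

    // "year" is a hypothetical field name used only in this sketch.
    static Document buildDocument(int year) {
        Document doc = new Document();
        // Indexed for exact and range queries; the value itself is not stored.
        doc.add(new IntPoint("year", year));
        // Stored copy of the same value, so it can be read back from a hit.
        doc.add(new StoredField("year", year));
        return doc;
    }

    // Hypothetical helper: returns the value of the stored copy, or null if there is none.
    static Integer storedYear(Document doc) {
        for (IndexableField field : doc.getFields("year")) {
            if (field instanceof StoredField) {
                return field.numericValue().intValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = buildDocument(2016);
        System.out.println(storedYear(doc));
    }
}
```

If the integer never needs to be queried, the IntPoint line can simply be dropped and the StoredField alone carries the value.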

See this recent answer of mine.

mike rodent answered Sep 30 '22 06:09
