Indexing a Boolean value (true/false) in Lucene (no need to store it)
I want to use as little disk space as possible and get the best search performance.
doc.add(new Field("boolean","true",Field.Store.NO,Field.Index.NOT_ANALYZED_NO_NORMS));
//or
doc.add(new Field("boolean","1",Field.Store.NO,Field.Index.NOT_ANALYZED_NO_NORMS));
//or
doc.add(new NumericField("boolean",Integer.MAX_VALUE,Field.Store.NO,true).setIntValue(1));
Which should I choose? Or is there a better way?
Thanks a lot.
An interesting question!
If I were faced with this, I think I would choose option one (the "true" and "false" terms), if that helps with the final decision.
Your choice of NOT_ANALYZED_NO_NORMS looks good, I think.
Lucene jumps through an elaborate set of hoops to make NumericField searchable by NumericRangeQuery, so definitely avoid it in all cases where your values don't represent quantities. For example, even if you index an integer, but only as a unique ID, you would still want to use a plain String field. Using "true"/"false" is the most natural way to index a boolean, while using "1"/"0" gives just a slight advantage by avoiding the possibility of a case mismatch or typo. I'd say this advantage is not worth much, and I would go for true/false.
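If the case-mismatch or typo risk is a concern, one possible approach is to produce the term text in a single place and use it for both indexing and querying. The sketch below uses only the JDK; the class name BooleanTerms and its methods are hypothetical helpers made up for illustration, and the field name "boolean" is simply the one from the question.

```java
import java.util.Locale;

// Hypothetical helper (not part of Lucene): one source of truth for the
// term text of a boolean field, so index-time and query-time values match.
final class BooleanTerms {
    // Field name taken from the question's snippets.
    static final String FIELD = "boolean";

    private BooleanTerms() {}

    // Canonical term text for a flag: exactly "true" or "false".
    static String toTerm(boolean value) {
        return Boolean.toString(value);
    }

    // Normalizes loosely typed input such as " TRUE " to the canonical
    // term text, and rejects anything that is not a boolean literal.
    static String normalize(String raw) {
        String s = raw.trim().toLowerCase(Locale.ROOT);
        if (s.equals("true") || s.equals("false")) {
            return s;
        }
        throw new IllegalArgumentException("not a boolean: " + raw);
    }
}
```

With the Lucene 3.x API used in the question, indexing would then look like doc.add(new Field(BooleanTerms.FIELD, BooleanTerms.toTerm(flag), Field.Store.NO, Field.Index.NOT_ANALYZED_NO_NORMS)), and the matching query would be new TermQuery(new Term(BooleanTerms.FIELD, BooleanTerms.toTerm(true))).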