A bigger question: will Solr even be able to support this? I know I have seen Lucene do this, and Solr is built on Lucene.
I found an example somewhere via Google but can't find it again, and the example was incomplete: I don't think it showed the query side, i.e. how to write the query statement for Lucene. I remember seeing a NumericField, and there is also a NumericComparator.
Basically, I am working on a NoSQL ORM solution (on GitHub) that offers indexing. The client decides how many indexes there are per table and the partitioning methodology; you add entities to an index and remove them yourself, and you can use namedQueries, though you have to get the index by name before querying since one table may have millions of indexes. The two main things I want to achieve are that it all works with an in-memory NoSQL fake db and an in-memory index (Lucene's RAMDirectory), AND that I can then switch those to plugging in Cassandra and SOLR.
I basically need to
If you need more details, the main query code of the project is currently at https://github.com/deanhiller/nosqlORM/blob/master/input/javasrc/com/alvazan/orm/layer3/spi/index/inmemory/MemoryIndexWriter.java
and on line 172 you can see I am adding a new Field every time, but unfortunately some of these may be ints.
BIG QUESTION: Can SOLR even support int vs. string? (If not, I will have to go with the hack of padding 0's on the front of ints, longs, etc. so that all values are the same length.)
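For reference, the zero-padding fallback mentioned above could look roughly like the sketch below. It is only an illustration, not code from the project: the class name SortableIntEncoder and its methods are hypothetical, and it assumes the values are indexed as plain strings. Negative ints are handled by shifting the whole int range to be non-negative before padding, so string order matches numeric order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: encodes ints as fixed-width strings whose
// lexicographic order matches the numeric order of the original values.
public final class SortableIntEncoder {
    private SortableIntEncoder() {}

    // Shift [-2^31, 2^31-1] to [0, 2^32-1] so every value is non-negative,
    // then left-pad with zeros to 10 digits (2^32-1 has 10 digits).
    public static String encode(int value) {
        long shifted = (long) value - Integer.MIN_VALUE;
        return String.format(Locale.ROOT, "%010d", shifted);
    }

    // Reverse of encode: parse the digits and undo the shift.
    public static int decode(String encoded) {
        return (int) (Long.parseLong(encoded) + Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<String>();
        for (int v : new int[] {1300, -7, 9, 1200, 10}) {
            terms.add(encode(v));
        }
        // Plain string sort, as a string-only index would do it.
        Collections.sort(terms);
        for (String t : terms) {
            System.out.println(t + " -> " + decode(t));
        }
    }
}
```

With this encoding a string range over the encoded terms behaves like a numeric range, at the cost of storing every value as a 10-character term.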
IF SOLR can support it, then in Lucene what is the best way to do this, or is there a good example?
The main index interface, retrieved from NoSqlEntityManager.getIndex(Class clazz, String indexPartitionName), is at https://github.com/deanhiller/nosqlORM/blob/master/input/javasrc/com/alvazan/orm/api/Index.java (though I'm not sure it matters).
thanks, Dean
From the example SOLR schema.xml file:
<!--
Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types.
-->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
<!--
Numeric field types that index each value at various levels of precision
to accelerate range queries when the number of values between the range
endpoints is large. See the javadoc for NumericRangeQuery for internal
implementation details.
Smaller precisionStep values (specified in bits) will lead to more tokens
indexed per value, slightly larger index size, and faster range queries.
A precisionStep of 0 disables indexing at different precision levels.
-->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>
So if you index a field as one of the field types above and then query it via its field name (e.g. myIntField:1234), it will do the "right thing", and you can also do range searches against it (e.g. myIntField:[1200 TO 1300]). The same goes for floats, etc.
I think we can leverage the org.apache.lucene.document.NumericField class. Its setIntValue, setLongValue, setFloatValue and setDoubleValue methods support int, long, float and double. For other data types (e.g. bool, datetime), we can convert them to an int or long first.
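The conversion for other data types could be sketched as follows. This is a minimal illustration using only the JDK; the class name NumericConversions and its method names are made up for this example. The resulting int or long would then be handed to the numeric field (for NumericField, via setIntValue or setLongValue).

```java
import java.util.Date;

// Hypothetical helper: maps non-numeric types onto int/long so they can be
// indexed as numeric values and compared or range-queried like numbers.
public final class NumericConversions {
    private NumericConversions() {}

    // false -> 0, true -> 1
    public static int toInt(boolean flag) {
        return flag ? 1 : 0;
    }

    public static boolean toBoolean(int value) {
        return value != 0;
    }

    // Milliseconds since the epoch; ordering of longs matches ordering of dates.
    public static long toLong(Date date) {
        return date.getTime();
    }

    public static Date toDate(long millis) {
        return new Date(millis);
    }

    public static void main(String[] args) {
        Date now = new Date();
        long stored = toLong(now);
        System.out.println("stored long: " + stored);
        System.out.println("round trip equal: " + now.equals(toDate(stored)));
        System.out.println("true as int: " + toInt(true));
    }
}
```

Because the epoch-millisecond value preserves chronological order, a numeric range over the stored long corresponds to a date range.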
BTW, I looked at Lucene's latest source code, which adds new classes: FloatField, IntField, LongField and DoubleField. They will be included in the next release. http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/document/