Using CLucene vs Java Lucene

Tags:

lucene

clucene

I am currently using Java Lucene for one of my projects and getting acceptable performance. I am looking for a C/C++ alternative to Lucene and came across CLucene on SourceForge.

However, I wanted to check whether CLucene is as stable and reliable as Java Lucene and whether it supports all of Java Lucene's features. Is it also Apache licensed and actively supported? If yes, why is there no option to download CLucene on the Apache Lucene site (the site does offer Lucene.NET, though)?

I would like to understand more about using CLucene in enterprise software.

Rushik asked Feb 17 '12 07:02


1 Answer

CLucene is available under the Apache License v2.0 and is hosted on SourceForge. It cannot be downloaded from the Lucene website because CLucene is an independent project. However, Lucy, which is a C port of Lucene (targeting dynamic languages), is available from the Lucene website because it is a sub-project of Lucene. The same applies to Lucene.NET.

Unless you are forced not to use a JVM language, I would recommend you use the Java version.

All development happens in the Java version first and is only sometimes backported to ports such as CLucene. As a consequence, many useful features are still available only in the Java version (for example, function queries are not available in CLucene).

Regarding performance, C/C++ might sometimes be faster than Java, but many parts of the Java version use very neat algorithms to improve performance, such as:

  • Levenshtein automata for fuzzy queries,
  • a non-blocking flushing mechanism to improve indexing throughput.
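
To make the first point concrete, here is a minimal sketch of what a fuzzy query has to decide: whether a candidate term lies within a given edit distance of the query term. This is not Lucene's implementation. It is the plain dynamic-programming Levenshtein distance, and the class and method names (EditDistance, distance, fuzzyMatches) are made up for illustration. As I understand it, the automaton approach instead compiles the query term into an automaton once and intersects it with the term dictionary, which avoids running a per-term computation like this one against every indexed term.

```java
public class EditDistance {
    // Classic dynamic-programming Levenshtein distance
    // (insertions, deletions, substitutions), using two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            // Swap rows so prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: does candidate match term within maxEdits edits?
    static boolean fuzzyMatches(String term, String candidate, int maxEdits) {
        return distance(term, candidate) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(distance("lucene", "lucine"));          // prints 1
        System.out.println(fuzzyMatches("lucene", "clucene", 1));  // prints true
    }
}
```

Running this check against every term in a large index is the cost that the automaton-based approach is meant to avoid.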

Last but not least, the Java version is the most thoroughly tested one and is used by many very high-traffic websites such as LinkedIn and Twitter.

jpountz answered Nov 18 '22 05:11
