Is it possible to use Apache Mahout without any dependency on Hadoop?
I would like to use the Mahout algorithms on a single computer by including only the Mahout library in my Java project. I don't want to use Hadoop at all, since I will be running on a single node anyway.
Is that possible?
Mahout runs on top of Hadoop, which lets you apply machine learning, through a selection of Mahout algorithms, to distributed computing. Mahout's core algorithms include recommendation mining, clustering, classification, and frequent item-set mining.
Apache Mahout is a scalable machine learning library that gives developers optimized implementations of popular machine learning techniques such as recommendation, classification, and clustering.
MLlib, Apache Spark's machine learning library, is often cited as being 10 to 100 times faster than Hadoop MapReduce and the MapReduce-based parts of Apache Mahout.
Yes. Not all of Mahout depends on Hadoop, though much does. If you use a piece that depends on Hadoop, of course, you need Hadoop. But for example there is a substantial recommender engine code base that does not use Hadoop.
You can embed a local Hadoop cluster/worker in a Java program.
Definitely, yes. In the Mahout Recommender First-Timer FAQ they advise against starting out with a Hadoop-based implementation (unless you know you're going to be scaling past 100 million user preferences relatively quickly).
You can use the implementations of the Recommender interface in pure Java relatively easily, or place one in the servlet of your choice.
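To make that concrete, here is a minimal sketch of a user-based recommender that runs entirely in-process, with no Hadoop cluster involved. It assumes the Taste classes from mahout-core (FileDataModel, PearsonCorrelationSimilarity, NearestNUserNeighborhood, GenericUserBasedRecommender) are on the classpath. The class name LocalRecommenderSketch, the file name ratings.csv, the neighborhood size of 2, and the user ID 1 are placeholders chosen for illustration, not anything from the Mahout docs.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

// Hypothetical class name; everything here runs in a single JVM.
final class LocalRecommenderSketch {

    // Builds a user-based recommender from a CSV file of "userID,itemID,preference"
    // lines and returns at most howMany recommendations for the given user.
    static List<RecommendedItem> recommend(File ratingsCsv, long userId, int howMany)
            throws IOException, TasteException {
        DataModel model = new FileDataModel(ratingsCsv);
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        // Neighborhood size 2 is an arbitrary choice suited to tiny data sets.
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
        return recommender.recommend(userId, howMany);
    }

    public static void main(String[] args) throws Exception {
        // "ratings.csv" is an assumed default path; pass your own as the first argument.
        File ratings = new File(args.length > 0 ? args[0] : "ratings.csv");
        for (RecommendedItem item : recommend(ratings, 1L, 3)) {
            System.out.println(item.getItemID() + " -> " + item.getValue());
        }
    }
}
```

The same recommend method could be called from a servlet's request handler; in that case you would probably want to build the Recommender once at startup rather than per request.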
Technically, Mahout has a Maven dependency on Hadoop, but you can easily use the recommenders without the Hadoop JARs. This is described in the first few chapters of Mahout in Action: you can download the sample source code and see how it's done in the file RecommenderIntro.java.
However, if you're using Maven, you would need to exclude Hadoop manually - the dependency would look like this:
<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>