
Is it possible to use Apache Mahout without a Hadoop dependency?

Is it possible to use Apache Mahout without any dependency on Hadoop?

I would like to use the Mahout algorithms on a single computer by including only the Mahout library in my Java project. I don't want to use Hadoop at all, since I will be running on a single node anyway.

Is that possible?

skyde asked Oct 19 '11 00:10




2 Answers

Yes. Not all of Mahout depends on Hadoop, though much does. If you use a piece that depends on Hadoop, of course, you need Hadoop. But for example there is a substantial recommender engine code base that does not use Hadoop.

You can embed a local Hadoop cluster/worker in a Java program.

Sean Owen answered Sep 27 '22 17:09


Definitely, yes. In the Mahout Recommender First-Timer FAQ they advise against starting out with a Hadoop-based implementation (unless you know you're going to be scaling past 100 million user preferences relatively quickly).

You can use the implementations of the Recommender interface in plain Java relatively easily, or embed one in the servlet of your choice.
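As a rough illustration of the pure-Java route, here is a minimal sketch of a user-based recommender built on Mahout's Taste classes, with no Hadoop calls anywhere. It assumes mahout-core and its non-Hadoop dependencies are on the classpath. The class name LocalRecommenderSketch, the helper names, the neighborhood size of 2, and the sample preference data are all made up for illustration; this is not the book's RecommenderIntro.java.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

// Hypothetical class name; a sketch, not an official Mahout example.
public class LocalRecommenderSketch {

    // Builds an in-memory, single-JVM recommender from a CSV file of
    // "userID,itemID,preference" lines and asks it for recommendations.
    static List<RecommendedItem> recommendFor(File prefsCsv, long userId, int howMany)
            throws IOException, TasteException {
        DataModel model = new FileDataModel(prefsCsv);
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        // Neighborhood size 2 is an arbitrary choice for this tiny data set.
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
        return recommender.recommend(userId, howMany);
    }

    // Writes a small, invented preference file to a temp location.
    static File writeSamplePrefs() throws IOException {
        File prefs = File.createTempFile("prefs", ".csv");
        prefs.deleteOnExit();
        Files.write(prefs.toPath(), Arrays.asList(
                "1,101,5.0", "1,102,3.0", "1,103,2.5",
                "2,101,2.0", "2,102,2.5", "2,103,5.0", "2,104,2.0",
                "3,101,4.5", "3,102,3.5", "3,104,4.0", "3,105,5.0",
                "4,101,5.0", "4,103,3.0", "4,104,4.5", "4,106,4.0"));
        return prefs;
    }

    public static void main(String[] args) throws Exception {
        // Ask for at most one recommendation for user 1 and print whatever comes back.
        for (RecommendedItem item : recommendFor(writeSamplePrefs(), 1L, 1)) {
            System.out.println(item);
        }
    }
}
```

Everything here runs inside one JVM and keeps the preference data in memory, which is the setup the question describes. In a servlet, the same recommender object could be built once at startup and reused across requests.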

Technically, Mahout has a Maven dependency on Hadoop, but you can easily use the recommenders without the Hadoop JARs. This is described in the first few chapters of Mahout in Action. You can download the sample source code and see how it's done; look at the file RecommenderIntro.java.

However, if you're using Maven, you need to exclude Hadoop manually. The dependency would look like this (add a version element for the Mahout release you use, unless a parent POM already manages it):

<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
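
One way to sanity-check that the exclusion took effect is to probe the runtime classpath for a well-known Hadoop class. The sketch below uses only the JDK. The class name HadoopClasspathCheck and the helper isClassPresent are hypothetical, and probing org.apache.hadoop.conf.Configuration is just one reasonable choice of marker class.

```java
// Hypothetical helper; uses only the JDK, no Mahout or Hadoop imports.
public class HadoopClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized, so no static initializers run.
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, HadoopClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Configuration is a core Hadoop class; if it is visible here,
        // some Hadoop JAR is still on the classpath.
        String probe = "org.apache.hadoop.conf.Configuration";
        if (isClassPresent(probe)) {
            System.out.println("Hadoop classes are still on the classpath");
        } else {
            System.out.println("No Hadoop classes found on the classpath");
        }
    }
}
```

Running this with the same classpath as the application gives a quick signal; inspecting Maven's resolved dependency tree is a more thorough check.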
Eyal answered Sep 27 '22 19:09