I've been working with the Apache Mahout machine learning libraries in my free time over the past few weeks. I'm curious to hear how others are using these libraries.
Mahout offers developers a ready-to-use framework for data mining tasks on large volumes of data, and lets applications analyze large data sets effectively and quickly. It includes several MapReduce-enabled clustering implementations, such as k-means, fuzzy k-means, Canopy, Dirichlet, and Mean-Shift.
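For anyone who hasn't seen k-means before, here is a minimal single-machine sketch of the idea in plain Python. It is not Mahout's API or its MapReduce implementation; the function name `kmeans`, the sample points, and the starting centroids are all made up for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, assignments)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignments

# Hypothetical sample data: two obvious groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(assignments)
```

Roughly speaking, a distributed version splits the assignment step across mappers and recomputes the centroids in reducers, but the two-step loop is the same.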
Apache Mahout is a highly scalable machine learning library that enables developers to use optimized algorithms. Mahout implements popular machine learning techniques such as recommendation, classification, and clustering. Therefore, it is prudent to have a brief section on machine learning before we move further.
There are many supervised learning algorithms, such as neural networks, Support Vector Machines (SVMs), and Naive Bayes classifiers. Mahout implements a Naive Bayes classifier.
This article is fairly thorough and has good examples: https://www.ibm.com/developerworks/java/library/j-mahout/index.html
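To make the Naive Bayes part concrete, here is a small sketch of a multinomial Naive Bayes text classifier with add-one smoothing, in plain Python. It does not use Mahout and is not how Mahout structures its classifier; the `train` and `classify` names and the tiny spam/ham data set are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs is a list of (label, words) pairs; returns the counts needed to classify."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, words in docs:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def classify(model, words):
    """Return the label with the highest log-probability for the given words."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior for the class.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data.
model = train([
    ("spam", ["cheap", "pills", "buy"]),
    ("spam", ["buy", "now", "cheap"]),
    ("ham", ["meeting", "at", "noon"]),
    ("ham", ["lunch", "at", "noon"]),
])
print(classify(model, ["cheap", "buy"]))
print(classify(model, ["noon", "meeting"]))
```

Working in log space avoids underflow when many small probabilities are multiplied together.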
I used Mahout to implement a distributed recommender system. Our project was to develop a mobile platform to create collaborative filtering recommendations. We gathered data from the users and then used Mahout + Hadoop to make the recommendations.
The project is described here:
http://ceur-ws.org/Vol-676/paper9.pdf
http://egc2012.labri.fr/abstracts/139.pdf
And some of it in my blog here: http://girlincomputerscience.blogspot.com.br/
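For readers new to collaborative filtering, here is a rough single-machine sketch of the user-based variant in plain Python. It is an illustration only: the post above does not say which variant the project used, this is not Mahout's recommender API, and the ratings, user names, and the `recommend` function are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two users' rating dicts (missing items count as 0)."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by a similarity-weighted average rating."""
    weighted, sim_sums = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                weighted[item] = weighted.get(item, 0.0) + sim * rating
                sim_sums[item] = sim_sums.get(item, 0.0) + sim
    scores = [(item, weighted[item] / sim_sums[item]) for item in weighted]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 4, "D": 1},
    "carol": {"A": 1, "B": 2, "D": 5},
}
recommendations = recommend(ratings, "alice")
print(recommendations)
```

Here alice's tastes are closest to bob's, so item C (which bob rated highly) ranks ahead of item D.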