
Machine learning algorithms that cannot use the MapReduce model

The paper "Map-Reduce for Machine Learning on Multicore" presents 10 machine learning algorithms that can benefit from the MapReduce model. Its key point is that "any algorithm fitting the Statistical Query Model may be written in a certain 'summation form'", and that any algorithm which can be expressed in summation form can use the MapReduce programming model.
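To make "summation form" concrete, here is a minimal sketch (my own illustration, not code from the paper) that fits a line by least squares. Each mapper computes partial sums over its chunk of the data and the reducer adds them up. The names `map_chunk`, `reduce_stats` and `fit_line` are hypothetical, and plain Python `map`/`reduce` stands in for a real MapReduce runtime.

```python
from functools import reduce

def map_chunk(chunk):
    """Mapper: partial sums (n, sum x, sum y, sum x*x, sum x*y) for one chunk."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)

def reduce_stats(a, b):
    """Reducer: element-wise addition of two partial-sum tuples."""
    return tuple(u + v for u, v in zip(a, b))

def fit_line(chunks):
    """Fit y = slope * x + intercept from the globally summed statistics."""
    n, sx, sy, sxx, sxy = reduce(reduce_stats, map(map_chunk, chunks))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Toy data drawn from y = 2x + 1, split into two chunks (assumed layout).
chunks = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
slope, intercept = fit_line(chunks)
```

The mappers never need to see each other's data; only the five sums travel to the reducer, which is what makes this kind of algorithm easy to parallelize.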

That an algorithm cannot be expressed in summation form does not mean it cannot use the MapReduce model. Could anyone point out a specific machine learning algorithm that cannot be sped up with MapReduce?

user1841342 asked Nov 21 '12 09:11



1 Answer

MapReduce does not work when there are computational dependencies in the data, because it assumes that map tasks can run independently of one another. This limitation makes it difficult to represent algorithms that operate on structured models.

As a consequence, when confronted with large-scale problems, we often abandon rich structured models in favor of overly simplistic methods that are amenable to the MapReduce abstraction [2].

In the machine learning community, numerous algorithms iteratively transform parameters during both learning and inference, e.g., Belief Propagation, Expectation Maximization, Gradient Descent and Gibbs Sampling. These algorithms iteratively refine a set of parameters until some termination criterion is met [2].

If you invoke MapReduce in each iteration, then yes, I think you can still speed up the computation. The point is that we want a better abstraction, one that makes it possible to exploit the graphical structure of the data, to express sophisticated scheduling, and to assess termination automatically.
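As a rough sketch of "MapReduce in each iteration" (an illustration under my own assumptions, not how any particular framework does it), here is batch gradient descent for a one-parameter model y = w * x. The gradient inside one iteration is a sum over chunks, so it maps and reduces cleanly, but iteration k+1 cannot start until iteration k has finished. The names `partial_gradient` and `gradient_descent`, the learning rate and the tolerance are all hypothetical choices.

```python
from functools import reduce

def partial_gradient(w, chunk):
    """Mapper: summed gradient of 0.5 * (w*x - y)**2 over one chunk."""
    return sum((w * x - y) * x for x, y in chunk)

def gradient_descent(chunks, lr=0.05, tol=1e-10, max_iters=1000):
    """Run one map/reduce pass per iteration until w stops changing."""
    n = sum(len(c) for c in chunks)
    w = 0.0
    for it in range(1, max_iters + 1):
        # Map: one partial gradient per chunk. Reduce: add them together.
        grad = reduce(lambda a, b: a + b,
                      (partial_gradient(w, c) for c in chunks)) / n
        w_new = w - lr * grad
        # The termination check lives in the driver loop, outside MapReduce.
        if abs(w_new - w) < tol:
            return w_new, it
        w = w_new
    return w, max_iters

# Toy data drawn from y = 3x, split into two chunks (assumed layout).
chunks = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w, iterations = gradient_descent(chunks)
```

Note that the loop, the convergence test and the hand-off of `w` between passes all sit outside the map and reduce steps. In a real cluster each pass would be a separate job, and that per-iteration overhead is part of why a richer abstraction is attractive.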

BTW, GraphLab is one of the alternatives motivated by the reasons above [2].

greeness answered Sep 27 '22 21:09
