Is it possible to run Python's scikit-learn algorithms over Hadoop? [closed]

I know it is possible to use Python on Hadoop.

But is it possible to use scikit-learn's machine learning algorithms on Hadoop?

If not, is there a machine learning library for Python that works with Hadoop?

Thanks for your help.

shanks_roux asked Feb 17 '14 10:02




1 Answer

Short answer: yes, because you can run almost anything on Hadoop.

Long answer: it depends. Start by answering this question:

  • Can you split your dataset into partitions?
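To make the partitioning question concrete, here is a minimal sketch in plain Python, not scikit-learn or Hadoop code. It assumes a model whose training reduces to summaries that can be merged, in this case a nearest-centroid classifier. The function names (`map_partition`, `reduce_stats`, `predict`) and the toy data are hypothetical, chosen only for illustration. Each partition is summarised independently, as a mapper would do, and the summaries are then combined, as a reducer would do.

```python
def map_partition(rows):
    """Mapper step: summarise one partition as per-label (count, feature sums)."""
    stats = {}
    for features, label in rows:
        count, sums = stats.get(label, (0, [0.0] * len(features)))
        stats[label] = (count + 1, [s + x for s, x in zip(sums, features)])
    return stats


def reduce_stats(partials):
    """Reducer step: merge the partial summaries and return per-label centroids."""
    merged = {}
    for stats in partials:
        for label, (count, sums) in stats.items():
            old_count, old_sums = merged.get(label, (0, [0.0] * len(sums)))
            merged[label] = (
                old_count + count,
                [a + b for a, b in zip(old_sums, sums)],
            )
    return {
        label: [s / count for s in sums]
        for label, (count, sums) in merged.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy data, split into two partitions as a cluster would see it.
partition_a = [([1.0, 1.0], "low"), ([9.0, 9.0], "high")]
partition_b = [([2.0, 2.0], "low"), ([11.0, 11.0], "high")]

centroids = reduce_stats([map_partition(partition_a), map_partition(partition_b)])
print(centroids)
print(predict(centroids, [1.2, 0.8]))
```

If the algorithm you need cannot be expressed as mergeable per-partition work like this, running it on Hadoop gets much harder. On the scikit-learn side, some estimators expose a `partial_fit` method for incremental training on chunks of data, which may be a useful starting point, although whether that fits your job is something you would have to check.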

Also, you may find this presentation useful (the Hadoop part starts at slide 73).

Viacheslav Rodionov answered Oct 16 '22 04:10
