
NumPy exception when using MLlib even though Numpy is installed

Here's the code I'm trying to execute:

from pyspark.mllib.recommendation import ALS
iterations=5
lambdaALS=0.1
seed=5L
rank=8
model=ALS.train(trainingRDD,rank,iterations, lambda_=lambdaALS, seed=seed)

When I run the ALS.train call above, which depends on NumPy, the Py4J library that Spark uses throws the following error:

Py4JJavaError: An error occurred while calling o587.trainALSModel.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 67.0 failed 4 times, most recent failure: Lost task 0.3 in stage 67.0 (TID 195, 192.168.161.55): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/worker.py", line 98, in main
    command = pickleSer._read_with_length(infile)
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 164, in _read_with_length
    return self.loads(obj)
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 421, in loads
    return pickle.loads(obj)
  File "/home/platform/spark/python/lib/pyspark.zip/pyspark/mllib/__init__.py", line 27, in <module>
Exception: MLlib requires NumPy 1.4+

NumPy 1.10 is installed on the machine named in the error message. Moreover, I get version 1.9.2 when I run the following directly in my Jupyter notebook:

import numpy
numpy.version.version

I am apparently running a version of NumPy older than 1.4 somewhere, but I don't know where. How can I tell which machine needs its NumPy updated?

pelicanactor asked Oct 09 '15 19:10



1 Answer

It is a bug in the MLlib init code:

import numpy
if numpy.version.version < '1.4':
    raise Exception("MLlib requires NumPy 1.4+")

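The check compares strings, so the comparison is lexicographic: '1.10' < '1.4' is true, and NumPy 1.10 is wrongly treated as older than 1.4. You can use NumPy 1.9.2 instead.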
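To see the effect in isolation, here is a minimal sketch in plain Python, with no Spark or NumPy needed. The variable names are only illustrative:

```python
# Strings are compared character by character. For "1.10" vs "1.4" the first
# two characters match, so the third pair ('1' vs '4') decides the result.
numpy_110_rejected = "1.10" < "1.4"   # True: 1.10 looks "older" than 1.4
numpy_192_rejected = "1.9.2" < "1.4"  # False: '9' sorts after '4'

print(numpy_110_rejected, numpy_192_rejected)
```

This is why the same cluster works with NumPy 1.9.2 but fails with 1.10.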

If you have to use NumPy 1.10 and don't want to upgrade to Spark 1.5.1, patch the check manually in the MLlib init code: https://github.com/apache/spark/blob/master/python/pyspark/mllib/__init__.py
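A minimal sketch of what such a manual change could look like, assuming it is enough to compare the major and minor numbers as integers. The helper name numpy_is_too_old is hypothetical, and this is not necessarily the exact change made in Spark:

```python
def numpy_is_too_old(version_string, minimum=(1, 4)):
    """Return True if version_string is below `minimum` (major, minor).

    Hypothetical helper: only the first two dot-separated fields are used,
    and they are compared as integers instead of as text.
    """
    major_minor = tuple(int(part) for part in version_string.split(".")[:2])
    return major_minor < minimum


# In pyspark/mllib/__init__.py the check could then read (an assumption,
# shown as comments so this sketch runs without NumPy installed):
#
#   import numpy
#   if numpy_is_too_old(numpy.version.version):
#       raise Exception("MLlib requires NumPy 1.4+")

for candidate in ("1.3.0", "1.4", "1.9.2", "1.10.0"):
    print(candidate, numpy_is_too_old(candidate))
```

The edit has to reach the copy of pyspark that the workers actually load (the traceback shows it coming from pyspark.zip), since the exception is raised there and not on the driver.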

RanP answered Oct 17 '22 16:10
