I recently wanted to use Cython with Spark, for which I followed the following reference. I wrote the programs below as described, but I am getting this error:

    TypeError: fib_mapper_cython() takes exactly 1 argument (0 given)

spark_tools.py:

    def spark_cython(module, method):
        def wrapped(*args, **kwargs):
            global cython_function_
            try:
                return cython_function_(*args, **kwargs)
            except:
                import pyximport
                pyximport.install()
                cython_function_ = getattr(__import__(module), method)
            return cython_function_(*args, **kwargs)
        return wrapped()

fib.pyx:

    def fib_mapper_cython(n):
        '''
        Return the first Fibonacci number >= n.
        '''
        cdef int a = 0
        cdef int b = 1  # must start at 1; with a = b = 0 the loop never advances
        cdef int j = int(n)
        while b < j:
            a, b = b, a + b
        return b, 1

The driver script:

    from spark_tools import spark_cython
    import pyximport
    import os
    from pyspark import SparkContext
    from pyspark import SparkConf

    pyximport.install()
    os.environ["SPARK_HOME"] = "/home/spark-1.6.0"
    conf = (SparkConf().setMaster('local').setAppName('Fibo'))
    sc = SparkContext(conf=conf)  # pass the conf so the master and app name are used
    sc.addPyFile('file:///home/Cythonize/fib.pyx')
    sc.addPyFile('file:///home/Cythonize/spark_tools.py')
    lines = sc.textFile('file:///home/Cythonize/nums.txt')
    mapper = spark_cython('fib', 'fib_mapper_cython')
    fib_frequency = lines.map(mapper).reduceByKey(lambda a, b: a+b).collect()
    print fib_frequency

I get a TypeError whenever I run the program. Any ideas?
Cython tools can be conveniently accessed through the Jupyter notebook for a more streamlined and integrated experience. You can launch a notebook session by typing jupyter notebook on the command line, and you can load the Cython magic by typing %load_ext cython in a cell.
The Cython compiler converts Python source into C code that makes equivalent calls to the Python/C API. But Cython is much more than that, because parameters and variables can be declared to have C data types.
Cython is a static compiler for the Python and Cython programming languages; it simplifies the job of writing Python C extensions. Cython lets us compile Python code, and the result is a dynamic library that can be used as a Python module.
To turn your Python into Cython, first create a file with the .pyx extension rather than the .py extension. Inside this file, you can start by writing regular Python code (note that there are some limitations on the Python code accepted by Cython, as clarified in the Cython docs).
This is neither a Cython nor a PySpark issue; you unfortunately added an extra function call in the definition of spark_cython. Specifically, the function that wraps the call to cython_function_ is called with no arguments on return:

    return wrapped()  # call made, no args supplied.

As a result, you don't return the wrapped function when you execute this call. What you do instead is call wrapped with no *args or **kwargs. wrapped then calls fib_mapper_cython with no arguments (since *args, **kwargs are not supplied), hence the TypeError.
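
The behaviour described above can be reproduced without Spark or Cython. The sketch below is only an illustration: takes_one is a hypothetical pure-Python stand-in for fib_mapper_cython, and factory_buggy is a hypothetical stripped-down version of spark_cython that keeps only the faulty return statement.

```python
def takes_one(n):
    # Hypothetical stand-in for fib_mapper_cython: requires exactly one argument.
    return int(n) + 1


def factory_buggy(func):
    # Hypothetical reduction of spark_cython to the part that matters here.
    def wrapped(*args, **kwargs):
        return func(*args, **kwargs)
    # The trailing parentheses call wrapped immediately, with args == ()
    # and kwargs == {}, so func is invoked with no arguments at all.
    return wrapped()


try:
    factory_buggy(takes_one)
    raised = False
except TypeError:
    # The error surfaces while building the mapper, before any data is mapped.
    raised = True

print(raised)
```

Note that the exception is raised by the factory call itself, which matches the error appearing at the spark_cython(...) line rather than inside the map.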
You should instead return the function itself:

    return wrapped

and this issue should no longer be present.
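
For completeness, here is a minimal sketch of what a corrected wrapper could look like. It is an assumption-laden variation, not the code from the reference: lazy_function and its cache dictionary are hypothetical names, it uses importlib.import_module in place of __import__, and it demonstrates the idea with the standard-library math module so that it runs without Cython. For a .pyx module, the pyximport.install() call from the question would go where the comment indicates.

```python
import importlib


def lazy_function(module, method):
    # Hypothetical variant of spark_cython: resolves the target function on
    # first use and returns the wrapper itself, not the result of calling it.
    cache = {}

    def wrapped(*args, **kwargs):
        if "func" not in cache:
            # For a Cython module, pyximport.install() would be called here
            # before the import, as in the question's spark_cython.
            cache["func"] = getattr(importlib.import_module(module), method)
        return cache["func"](*args, **kwargs)

    return wrapped  # no parentheses: hand back the callable


# Stand-in usage with a stdlib function instead of fib.fib_mapper_cython.
mapper = lazy_function("math", "sqrt")
values = [mapper(x) for x in (4.0, 9.0)]
print(values)
```

The explicit cache check replaces the global variable and bare except in the question, so an unrelated error inside the mapped function is not mistaken for "function not loaded yet".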