Here are the functions in my class:
def labeling(self, value, labelMap, dtype='string'):
    if dtype.value == 'string':
        result = [i for v,i in labelMap.value if value==v][0]
        return result
    else:
        result = [i for v,i in labelMap.value if value<v][0]
        return result

def labelByValue(self, labelMap, dtype='string'):
    labeling = self.labeling
    labelMap = self.sc.broadcast(labelMap)
    dtype = self.sc.broadcast(dtype)
    self.RDD = self.RDD.map(labeling)
But when I call the function below from "main", it reports an error that begins: "It appears that you are attempting to reference SparkContext from a broadcast"
    class.RDD.labelByValue((('a', 1), ('b', 2), ('c', 3)))
I could not find anything on my own, so I came here for help. Thanks in advance.
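I finally resolved this error.
The problem is that the user-defined function has to be defined at the global (module) level, not inside the class. A bound method such as self.labeling carries a reference to self, and self holds the SparkContext (self.sc), so Spark ends up trying to serialize the SparkContext when it ships the function to the workers.
So labeling should look like this: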
def labeling(value, labelMap, dtype='string'):
    if dtype.value == 'string':
        result = [i for v,i in labelMap.value if value==v][0]
        return result
    else:
        result = [i for v,i in labelMap.value if value<v][0]
        return result
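
The answer does not show how the two broadcast variables reach the module-level function once it is handed to map, since map calls its function with a single argument. One possible way, sketched here as an assumption rather than the original poster's exact code, is to bind them with functools.partial. The FakeBroadcast class is a hypothetical stand-in that only mimics the .value attribute of a Spark broadcast variable, so the sketch runs without a cluster. The commented lines at the end show the assumed shape with a real sc and rdd.

```python
from functools import partial


class FakeBroadcast:
    """Hypothetical stand-in for a Spark broadcast variable (only exposes .value)."""

    def __init__(self, value):
        self.value = value


def labeling(value, labelMap, dtype):
    # Module-level function: it holds no reference to `self` or a SparkContext.
    if dtype.value == 'string':
        # Exact match on the first element of each (key, label) pair.
        return [i for v, i in labelMap.value if value == v][0]
    # Otherwise treat the keys as upper bounds and take the first one above value.
    return [i for v, i in labelMap.value if value < v][0]


label_map = FakeBroadcast((('a', 1), ('b', 2), ('c', 3)))
dtype = FakeBroadcast('string')

# Bind the two broadcast handles so the callable takes one argument,
# which is what map() passes for each element.
label_one = partial(labeling, labelMap=label_map, dtype=dtype)

labels = list(map(label_one, ['a', 'c', 'b']))
print(labels)

# With real Spark the same shape would be (assuming `sc` and `rdd` exist on the driver):
#   label_map = sc.broadcast((('a', 1), ('b', 2), ('c', 3)))
#   dtype = sc.broadcast('string')
#   rdd = rdd.map(partial(labeling, labelMap=label_map, dtype=dtype))
```

Inside the class, labelByValue would then build the partial from local variables only, so nothing that references self is captured by the function sent to the workers.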