I'm new to PySpark. I wrote this code:

def filterOut2(line):
    return [x for x in line if x != 2]

filtered_lists = data.map(filterOut2)
but I get this error:
'list' object has no attribute 'map'
How do I perform a map operation on my data in PySpark so that I keep only the values for which my condition evaluates to true?
The Python "AttributeError: 'list' object has no attribute 'map'" occurs because you are calling a method that lists simply don't have. Here, data is a plain Python list, not a Spark RDD, so it has no .map() method. In plain Python, map is a built-in function that takes the function first and the iterable second.
map(filterOut2, data)
works:
>>> data = [[1, 2, 3, 5], [1, 2, 5, 2], [3, 5, 2, 8], [6, 3, 1, 2], [5, 3, 2, 5], [4, 1, 2, 5]]
>>> def filterOut2(line):
...     return [x for x in line if x != 2]
...
>>> list(map(filterOut2, data))
[[1, 3, 5], [1, 5], [3, 5, 8], [6, 3, 1], [5, 3, 5], [4, 1, 5]]
If you instead get "map() takes exactly 1 argument (2 given)", it looks like you have redefined map somewhere in your session. Try builtins.map(filterOut2, data) (or __builtin__.map in Python 2), or restart the interpreter.
Or, use a list comprehension:
>>> [filterOut2(line) for line in data]
[[1, 3, 5], [1, 5], [3, 5, 8], [6, 3, 1], [5, 3, 5], [4, 1, 5]]
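For completeness, the same per-line filtering can also be written with the built-in filter() instead of a comprehension:

```python
data = [[1, 2, 3, 5], [1, 2, 5, 2], [3, 5, 2, 8],
        [6, 3, 1, 2], [5, 3, 2, 5], [4, 1, 2, 5]]

# filter() keeps the elements for which the predicate is true;
# wrap it in list() because filter() returns an iterator in Python 3.
result = [list(filter(lambda x: x != 2, line)) for line in data]
print(result)  # same output as the map()/comprehension versions
```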