I am trying to replicate a deep learning project from https://medium.com/linagora-engineering/making-image-classification-simple-with-spark-deep-learning-f654a8b876b8 . I am working with PySpark on Spark version 1.6.3, and I have installed Keras and TensorFlow. But every time I try to import from sparkdl, it throws an error. When I run this:
from sparkdl import readImages
I get this error:
File "C:\Users\HP\AppData\Local\Temp\spark-802a2258-3089-4ad7-b8cb-6815cbbb019a\userFiles-c9514201-07fa-45f9-9fd8-c8a3a0b4bf70\databricks_spark-deep-learning-0.1.0-spark2.1-s_2.11.jar\sparkdl\transformers\keras_image.py", line 20, in <module>
ImportError: cannot import name 'TypeConverters'
Can someone please help?
This is not a full fix, as I have not yet been able to import anything from sparkdl in Jupyter notebooks either. However:
readImages is available through ImageSchema in the pyspark.ml.image package, so to use it you first need to import:
from pyspark.ml.image import ImageSchema
Then use it like this:
imagesDF = ImageSchema.readImages("/path/to/imageFolder")
This will give you a DataFrame of the images, with a column named "image".
You can add a label column like this:
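Note that ImageSchema.readImages is not present in every Spark release. As far as I know it was added in Spark 2.3.0 and removed again in 3.0.0, so it will not exist on 1.6.3. Here is a minimal sketch of a version check under that assumption; has_read_images is a hypothetical helper name, not part of PySpark:

```python
def has_read_images(spark_version):
    """Return True if ImageSchema.readImages should exist in this Spark version."""
    # Keep only the leading "major.minor" part, e.g. "2.4.5" -> (2, 4).
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    # Assumption: readImages was added in 2.3.0 and removed in 3.0.0.
    return (2, 3) <= (major, minor) < (3, 0)


print(has_read_images("1.6.3"))  # False
print(has_read_images("2.4.5"))  # True
```

In a live session you could pass pyspark.__version__ instead of a hard-coded string.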
labledImageDF = imagesDF.withColumn("label", lit(0))
Remember to import the lit function from pyspark.sql.functions before using it:
from pyspark.sql.functions import lit
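Putting the pieces above together, something like the following should work. It is only a sketch: load_labeled_images is a hypothetical helper name, and it assumes a PySpark version that still ships ImageSchema.readImages and a Spark session that can be created or is already running.

```python
def load_labeled_images(path, label):
    """Read a folder of images and attach a constant label column."""
    # Imported inside the function so this file loads even where
    # pyspark is missing; the call itself still needs pyspark.
    from pyspark.ml.image import ImageSchema
    from pyspark.sql.functions import lit

    # readImages returns a DataFrame with a single "image" column.
    images_df = ImageSchema.readImages(path)
    # lit() wraps the Python value as a constant Column.
    return images_df.withColumn("label", lit(label))


# Example call (the path is a placeholder):
# labeled_df = load_labeled_images("/path/to/imageFolder", 0)
```

Importing only lit rather than everything from pyspark.sql.functions avoids shadowing Python built-ins such as sum and max.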
Hope this at least partially helps.