How do I resolve this issue?
rdd.collect()  # ['3e866d48b59e8ac8aece79597df9fb4c', ...]
rdd.toDF()  # Can not infer schema for type: <type 'str'>
from pyspark.sql.types import StructType, StructField, StringType
myschema = StructType([StructField("col1", StringType(), True)])
rdd.toDF(myschema).show()
# StructType can not accept object "3e866d48b59e8ac8aece79597df9fb4c" in type
It seems you have:
rdd = sc.parallelize(['3e866d48b59e8ac8aece79597df9fb4c'])
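This is a one-dimensional data structure, whereas a DataFrame is two-dimensional: each element of the RDD must be a row (a tuple, list, or Row), not a bare string. Mapping
each element to a one-element tuple solves the problem: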
rdd.map(lambda x: (x,)).toDF().show()
+--------------------+
|                  _1|
+--------------------+
|3e866d48b59e8ac8a...|
+--------------------+
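The same fix also works with the explicit schema from the question, which gives the column a proper name instead of _1. Below is a minimal sketch, assuming an existing SparkSession is passed in as spark; the helper names as_row and strings_to_df are hypothetical and only for illustration.

```python
def as_row(value):
    """Wrap a bare value in a one-element tuple so Spark treats it as a one-column row."""
    return (value,)


def strings_to_df(spark, values, column="col1"):
    """Build a single-column DataFrame of strings from a list of plain strings.

    `spark` is assumed to be an existing SparkSession.
    """
    # Imported inside the function so that as_row stays usable without Spark installed.
    from pyspark.sql.types import StructType, StructField, StringType

    schema = StructType([StructField(column, StringType(), True)])
    # Each string becomes a 1-tuple, which matches the one-field schema.
    rdd = spark.sparkContext.parallelize(values).map(as_row)
    return spark.createDataFrame(rdd, schema)
```

For example, strings_to_df(spark, ['3e866d48b59e8ac8aece79597df9fb4c']).show() should print a single column named col1. If the column name does not matter, passing an atomic type directly, as in spark.createDataFrame(rdd, StringType()), should also work without the map step; as far as I know, Spark then wraps the type in a one-field struct with a column named value.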