I'm trying to create an empty struct column in PySpark. For an array this works:
import pyspark.sql.functions as F
df = df.withColumn('newCol', F.array([]))
but this gives me an error:
df = df.withColumn('newCol', F.struct())
I saw a similar question, but it was for Scala rather than PySpark, so it doesn't really help me.
Actually, that array is not really empty, because it has an empty element. You should instead consider something like this:
import pyspark.sql.types as T
df = df.withColumn('newCol', F.lit(None).cast(T.StructType()))
PS: this is a late conversion of my comment into an answer, as was suggested. I hope it helps even though it comes long after the OP's question.
If you know the schema of the struct column, you can use the function from_json as follows:
import pyspark.sql.functions as F
from pyspark.sql.types import StructType, StructField, StringType

struct_schema = StructType([
    StructField('name', StringType(), False),
    StructField('surname', StringType(), False),
])
df = df.withColumn(
    'newCol', F.from_json(F.lit(""), struct_schema)
)