Add one more StructField to schema

Question

My PySpark data frame has the following schema:

spark_df.printSchema()

root
 |-- field_1: double (nullable = true)
 |-- field_2: double (nullable = true)
 |-- field_3 (nullable = true)
 |-- field_4: double (nullable = true)
 |-- field_5: double (nullable = true)
 |-- field_6: double (nullable = true)

I would like to add one more StructField to the schema, so that the new schema looks like this:

root
 |-- field_0: string (nullable = true)
 |-- field_1: double (nullable = true)
 |-- field_2: double (nullable = true)
 |-- field_3 (nullable = true)
 |-- field_4: double (nullable = true)
 |-- field_5: double (nullable = true)
 |-- field_6: double (nullable = true)

I know I can manually create a new_schema like below:

new_schema = StructType([StructField("field_0", StringType(), True),
                            :
                         StructField("field_6", IntegerType(), True)])

This works for a small number of fields, but it is impractical when I have hundreds of fields. Is there a more elegant way to add a new field to the beginning of the schema? Thanks!

zero323 · Accepted Answer

You can copy the existing fields and prepend the new one:

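from pyspark.sql.types import StringType, StructField, StructType

to_prepend = [StructField("field_0", StringType(), True)]

# df.schema.fields is a plain list of StructField, so + concatenates
StructType(to_prepend + df.schema.fields)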

Tags:

python

apache-spark

apache-spark-sql

pyspark

Edamame
