I'm reading a DataFrame from a Parquet file that has nested columns (struct).
How can I check if nested columns are present?
It might be like this
+----------------------+
|column1               |
+----------------------+
|{a_id:[1], b_id:[1,2]}|
+----------------------+
or like this
+---------------------+
|column1              |
+---------------------+
|{a_id:[3,5]}         |
+---------------------+
I know how to check whether a top-level column is present, as answered in "How do I detect if a Spark DataFrame has a column":
df.schema.fieldNames.contains("column_name")
But how can I check for a nested column?
You can get the schema of the nested field as a StructType, and then check whether your field is among its field names:
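import org.apache.spark.sql.types.StructType

// Locate the struct column, then look for the nested field among its field names
val index = df.schema.fieldIndex("column1")
val is_b_id_present = df.schema(index).dataType.asInstanceOf[StructType]
  .fieldNames.contains("b_id")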
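Note that fieldIndex throws an IllegalArgumentException if "column1" itself is missing, and the cast fails if that column is not a struct. If either case is possible, a more defensive variant might walk the schema one path segment at a time and return false instead of throwing. The following is only a sketch: hasNestedField is a hypothetical helper name, not part of the Spark API, and it assumes that field names contain no dots and that arrays of structs should be looked through.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not a Spark API): returns true if a dotted path such as
// "column1.b_id" resolves to a field in the given schema.
// Assumes field names themselves do not contain dots.
def hasNestedField(schema: StructType, path: String): Boolean = {
  def loop(current: DataType, parts: List[String]): Boolean = (current, parts) match {
    // Every path segment has been matched
    case (_, Nil) => true
    // Look up the next segment among the struct's fields
    case (st: StructType, head :: tail) =>
      st.fields.find(_.name == head) match {
        case Some(field) => loop(field.dataType, tail)
        case None        => false
      }
    // Look through arrays to their element type (e.g. array of structs)
    case (ArrayType(elementType, _), _) => loop(elementType, parts)
    // Segments remain but the current type cannot contain fields
    case _ => false
  }
  loop(schema, path.split('.').toList)
}
```

With the DataFrame from the question, this would be called as hasNestedField(df.schema, "column1.b_id").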