I have a dataframe with the following schema:
root
 |-- _id: long (nullable = true)
 |-- student_info: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |    |-- major: string (nullable = true)
 |    |-- honour_roll: boolean (nullable = true)
 |-- school_name: string (nullable = true)
How can I get a list of only the columns under "student_info", i.e. ["firstname", "lastname", "major", "honour_roll"]?
All of the following return the list of the struct's field names. The .columns approach looks cleanest.
df.select("student_info.*").columns
df.schema["student_info"].dataType.names
df.schema["student_info"].dataType.fieldNames()
df.select("student_info.*").schema.names
df.select("student_info.*").schema.fieldNames()