In PySpark you can define a schema and read data sources with this predefined schema, e.g.:
from pyspark.sql.types import StructType, StructField, DoubleType, StringType

Schema = StructType([
    StructField("temperature", DoubleType(), True),
    StructField("temperature_unit", StringType(), True),
    StructField("humidity", DoubleType(), True),
    StructField("humidity_unit", StringType(), True),
    StructField("pressure", DoubleType(), True),
    StructField("pressure_unit", StringType(), True)
])
For some data sources it is possible to infer the schema from the data and get a DataFrame with that schema.
Is it possible to get the schema definition (in the form shown above) from a DataFrame whose schema was inferred earlier?
df.printSchema()
prints the schema as a tree, but I need to reuse the schema in the form shown above, so that I can read one data source with a schema that was inferred from another data source.
DataFrame.printSchema() prints the schema of the DataFrame in tree format, showing each column name and data type. If the DataFrame has a nested structure, the schema is displayed as a nested tree.
You can find all column names and data types (DataType) of a PySpark DataFrame with df.dtypes and df.schema, and you can retrieve the data type of a specific column with df.schema["name"].dataType.
Yes, it is possible. Use the DataFrame.schema property. From the documentation:
schema
Returns the schema of this DataFrame as a pyspark.sql.types.StructType.
>>> df.schema
StructType(List(StructField(age,IntegerType,true),StructField(name,StringType,true)))
New in version 1.3.
The schema can also be exported to JSON and imported back if needed.
The code below gives you the schema of the known DataFrame as a list of fields. This is quite useful when you have a very large number of columns and writing the schema by hand is cumbersome. You can then hand-edit any fields you want and apply the result to your new DataFrame.
from pyspark.sql.types import StructType
schema = [i for i in df.schema]
And then from here, you have your new schema:
NewSchema = StructType(schema)