Using the max() function inside select(), we can compute the maximum value of a column, and then use the collect() method to bring that value back to the driver. Here, df is the input PySpark DataFrame and column_name is the column whose maximum value we want.
The PySpark SQL aggregate functions are grouped together as "agg_funcs" in PySpark. The kurtosis() function returns the kurtosis of the values in the group. The min() function returns the minimum value in the column. The max() function returns the maximum value in the column.
The dataframe has a date column stored as a string, e.g. '2017-01-01'.
It is converted to DateType():
df = df.withColumn('date', col('date_string').cast(DateType()))
I would like to calculate the first day and last day of the column. I tried the following code, but it does not work. Can anyone give any suggestions? Thanks!
df.select('date').min()
df.select('date').max()
df.select('date').last_day()
df.select('date').first_day()