I have a data frame in PySpark. In this data frame I have a column called id that is unique.
Now I want to find the maximum value of the id column in the data frame.
I have tried the following:
df['id'].max()
But I got the error below:
TypeError: 'Column' object is not callable
Please let me know how to find the maximum value of a column in a data frame.
In the answer by @Dadep, the link gives the correct answer. If you are using pandas, .max() will work:
>>> import pandas as pd
>>> df2=pd.DataFrame({'A':[1,5,0], 'B':[3, 5, 6]})
>>> df2['A'].max()
5
Otherwise, if it's a Spark dataframe, see:
Best way to get the max value in a Spark dataframe column
I'm coming from Scala, but I do believe that this is also applicable in Python.

val max = df.select(max("id")).first()

But you first have to import the max function. In Scala that is import org.apache.spark.sql.functions.max; in Python it is:

from pyspark.sql.functions import max
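Here is a minimal PySpark sketch of the same approach, assuming a dataframe df with an id column as in the question; the sample data and the spark_max alias are illustrative (the alias avoids shadowing Python's built-in max):

from pyspark.sql import SparkSession
from pyspark.sql.functions import max as spark_max

# illustrative data mirroring the question: a dataframe with a unique id column
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1,), (5,), (3,)], ["id"])

# select the maximum of the column; first() returns a Row, and [0] extracts the scalar
max_id = df.select(spark_max("id")).first()[0]
print(max_id)  # 5

Equivalently, df.agg({"id": "max"}).first()[0] returns the same value without importing anything from pyspark.sql.functions.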