
How to get the last row from DataFrame?

I have a DataFrame with two columns, 'value' and 'timestamp'. The 'timestamp' column is ordered. I want to get the last row of the DataFrame. What should I do?

This is my input:

+-----+---------+
|value|timestamp|
+-----+---------+
|    1|        1|
|    4|        2|
|    3|        3|
|    2|        4|
|    5|        5|
|    7|        6|
|    3|        7|
|    5|        8|
|    4|        9|
|   18|       10|
+-----+---------+

This is my code:

    // toDF requires the SQL implicits (e.g. spark.implicits._) to be in scope
    val arr = Array((1,1),(4,2),(3,3),(2,4),(5,5),(7,6),(3,7),(5,8),(4,9),(18,10))
    val df = m_sparkCtx.parallelize(arr).toDF("value", "timestamp")

This is my expected result:

+-----+---------+
|value|timestamp|
+-----+---------+
|   18|       10|
+-----+---------+

mentongwu asked Jul 31 '17 02:07


2 Answers

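Try this. Since the last row is defined by the ordered 'timestamp' column, sort on it in descending order and show only the first row:

    df.orderBy($"timestamp".desc).show(1)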
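
If the last row is needed as a value rather than printed output, the same idea can be sketched as below. This is a minimal illustration, not a definitive implementation: the object name `LastRowByTimestamp`, the helper `lastByTimestamp`, and the local `SparkSession` are assumptions made for the example, and it sorts on 'timestamp' because that is the column the question describes as ordered.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.col

object LastRowByTimestamp {
  // Hypothetical helper: sorts by timestamp descending and keeps one row.
  // The result is still a DataFrame, so it can be shown or processed further.
  def lastByTimestamp(df: DataFrame): DataFrame =
    df.orderBy(col("timestamp").desc).limit(1)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration; the question uses its own context.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("last-row")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, 1), (4, 2), (3, 3), (2, 4), (5, 5), (7, 6), (3, 7), (5, 8), (4, 9), (18, 10))
      .toDF("value", "timestamp")

    val lastDf = lastByTimestamp(df)
    lastDf.show()

    // head() brings the single remaining row to the driver as a Row.
    val last: Row = lastDf.head()
    println(s"value=${last.getInt(0)}, timestamp=${last.getInt(1)}") // prints "value=18, timestamp=10"

    spark.stop()
  }
}
```

On Spark 3.0 and later, `df.tail(1)` also returns the last row as an `Array[Row]`, but it relies on the DataFrame's current row order rather than on an explicit sort.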

Mimi Cheng answered Oct 20 '22 19:10


I would simply use a query that sorts the table by 'timestamp' in descending order and takes the first row:

    # PySpark; assumes this runs inside a class that holds a sqlContext
    df.createOrReplaceTempView("table_df")
    query_latest_rec = """SELECT * FROM table_df ORDER BY timestamp DESC LIMIT 1"""
    latest_rec = self.sqlContext.sql(query_latest_rec)
    latest_rec.show()
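
For readers working in Scala, as in the question, the SQL approach might look roughly like this. It is a sketch under stated assumptions: the object name `LastRowSql`, the helper `latestRecord`, and the local `SparkSession` are illustrative and not part of the original answer. The column name is wrapped in backticks so that it is always read as an identifier.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object LastRowSql {
  // Hypothetical helper: registers df as a temporary view and selects
  // the single row with the largest timestamp.
  def latestRecord(spark: SparkSession, df: DataFrame): DataFrame = {
    df.createOrReplaceTempView("table_df")
    spark.sql("SELECT * FROM table_df ORDER BY `timestamp` DESC LIMIT 1")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("last-row-sql")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, 1), (4, 2), (3, 3), (2, 4), (5, 5), (7, 6), (3, 7), (5, 8), (4, 9), (18, 10))
      .toDF("value", "timestamp")

    // Shows the row with value 18 and timestamp 10.
    latestRecord(spark, df).show()

    spark.stop()
  }
}
```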

Danylo Zherebetskyy answered Oct 20 '22 19:10