
R: How to get the maximum value of a datetime column in time series data

I am working with time series data that has two datetime columns and one fiscal week column. In the example below, I need to get the row with the maximum EditDate.

EditDate <- c("2015-04-01 11:40:13", "2015-04-03 02:54:45","2015-04-07 11:40:13")
ID <- c("DL1X8", "DL1X8","DL1X8")
Avg <- c(38.1517, 38.1517, 38.1517)
Sig <- c(11.45880000, 11.45880000, 11.45880000)
InsertDate <- c("2015-04-03 9:40:00", "2015-04-03 9:40:00", "2015-04-10 9:40:00")
FW <- c("39","39","40")

df1 <- data.frame(EditDate , ID, Avg, Sig, InsertDate, FW)

This returns

+---------------------+-------+---------+-------------+--------------------+----+
|   EditDate          | ID    | Avg     |   Sig       |    InsertDate      | FW |
+---------------------+-------+---------+-------------+--------------------+----+
| 2015-04-01 11:40:13 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-03 9:40:00 | 39 |
| 2015-04-03 02:54:45 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-03 9:40:00 | 39 |
| 2015-04-07 11:40:13 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-10 9:40:00 | 40 |
+---------------------+-------+---------+-------------+--------------------+----+

The desired output is

+---------------------+-------+---------+-------------+--------------------+----+
|   EditDate          | ID    | Avg     |   Sig       |    InsertDate      | FW |
+---------------------+-------+---------+-------------+--------------------+----+
| 2015-04-07 11:40:13 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-10 9:40:00 | 40 |
+---------------------+-------+---------+-------------+--------------------+----+

I tried sqldf with library(RH2), but it takes a long time to run.

df2 <- sqldf("SELECT * FROM df1 
                        WHERE (EditDate = (SELECT MAX(EditDate) FROM df1))
                        ORDER BY EditDate ASC")

I am not sure if it could be done using the dplyr package. Could someone provide inputs on how I could optimize this using dplyr or any other alternatives?

Sharath asked Apr 16 '15 21:04




3 Answers

Here's a one-liner with base R

df1[which.max(as.POSIXct(df1$EditDate)), ]
#              EditDate    ID     Avg     Sig         InsertDate FW
# 3 2015-04-07 11:40:13 DL1X8 38.1517 11.4588 2015-04-10 9:40:00 40

Or with data.table

library(data.table)
setDT(df1)[which.max(as.POSIXct(EditDate))]
#               EditDate    ID     Avg     Sig         InsertDate FW
# 1: 2015-04-07 11:40:13 DL1X8 38.1517 11.4588 2015-04-10 9:40:00 40
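
One thing to keep in mind with which.max() is that it returns only the first maximum, so if several rows share the latest timestamp you get just one of them. The sketch below uses a small hypothetical data frame (toy, not the question's data) with a deliberate tie, and assumes UTC as the time zone:

```r
# Hypothetical toy data with a tie for the latest EditDate (not from the question).
toy <- data.frame(
  EditDate = c("2015-04-01 11:40:13", "2015-04-07 11:40:13", "2015-04-07 11:40:13"),
  FW       = c("39", "40", "40"),
  stringsAsFactors = FALSE
)

# Parse once; the explicit tz is an assumption, set it to match your data.
edit_time <- as.POSIXct(toy$EditDate, tz = "UTC")

# which.max() gives the index of the first maximum only: a single row.
first_latest <- toy[which.max(edit_time), ]

# Comparing against max() keeps every row tied for the maximum.
all_latest <- toy[edit_time == max(edit_time), ]

nrow(first_latest)  # 1
nrow(all_latest)    # 2
```

If ties cannot occur in your data, both forms select the same row.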
David Arenburg answered Oct 25 '22 12:10



Just with lubridate

library(lubridate)

df1[ymd_hms(df1$EditDate) == max(ymd_hms(df1$EditDate)), ]

or

df1[df1$EditDate == as.character(max(ymd_hms(df1$EditDate))), ]

dimitris_ps answered Oct 25 '22 11:10



Use the data.table and lubridate libraries as follows:

 library(data.table)
 library(lubridate)
 setDT(df1)
 df1[,EditDate := ymd_hms(EditDate)]
 res <- df1[EditDate == max(EditDate)]
Yevgeny Tkach answered Oct 25 '22 12:10

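
The question also asks about dplyr, which none of the answers above use. A minimal sketch of one way it could look, assuming dplyr is installed and that UTC is an acceptable time zone for parsing (the tz value and the name latest are illustrative assumptions, not from the original post):

```r
library(dplyr)

# Rebuild the question's data; stringsAsFactors = FALSE keeps dates as character.
df1 <- data.frame(
  EditDate   = c("2015-04-01 11:40:13", "2015-04-03 02:54:45", "2015-04-07 11:40:13"),
  ID         = c("DL1X8", "DL1X8", "DL1X8"),
  Avg        = c(38.1517, 38.1517, 38.1517),
  Sig        = c(11.4588, 11.4588, 11.4588),
  InsertDate = c("2015-04-03 9:40:00", "2015-04-03 9:40:00", "2015-04-10 9:40:00"),
  FW         = c("39", "39", "40"),
  stringsAsFactors = FALSE
)

latest <- df1 %>%
  # Parse the strings so max() compares timestamps (tz = "UTC" is an assumption).
  mutate(EditDate = as.POSIXct(EditDate, tz = "UTC")) %>%
  # Keep every row whose EditDate equals the overall maximum.
  filter(EditDate == max(EditDate))

latest
```

Parsing the column once and filtering in memory avoids the round trip through a SQL backend, which may be where the sqldf version spends its time. To get the latest row per ID rather than overall, a group_by(ID) before the filter() step should do it.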