Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

dealing with the datetime value in R

First of all, I have a large data.table with the one parameter-Date, but the str(Date) is chr.

date                
2015-07-01 0:15:00 
2015-07-01 0:30:00 
2015-07-01 0:45:00 
2015-07-01 0:60:00 
2015-07-01 1:15:00 
2015-07-01 1:30:00 
2015-07-01 1:45:00 
2015-07-01 1:60:00 

what i want to do is

  • make them in standard format like: 2015-07-01 00:15:00
  • correct the time, for example: 2015-07-01 1:60:00 -> 2015-07-01 02:00:00

for the first one, I tried to use the function as.POSIXct() to reset the format, it should be correct, but the problem is for the data like 2015-07-01 1:60:00, after transformatiion, it is just NA.

anybody has ideas?

Here is a code to generate test data:

dd <- data.table(date = c("2015-07-01 0:15:00", "2015-07-01 0:30:00", 
    "2015-07-01 0:45:00","2015-07-01 0:60:00", "2015-07-01 1:15:00",
    "2015-07-01 1:30:00","2015-07-01 1:45:00","2015-07-01 1:60:00","2015-07-01 2:15:00"))

Note: this table is just for one day and the last value of the table is

2015-07-01 23:60:00

for any unclear points, feel free to let me know thanks for that !

like image 297
ZAWD Avatar asked Aug 25 '16 09:08

ZAWD


1 Answers

In base R you could try this:

df1$date <- gsub(":60:",":59:",df1$date, fixed = TRUE)    
df1$date <- as.POSIXct(df1$date)
the59s <- grepl(":59:",df1$date)
df1$date[the59s] <- df1$date[the59s] + 60
#> df1
#                 date
#1 2015-07-01 00:15:00
#2 2015-07-01 00:30:00
#3 2015-07-01 00:45:00
#4 2015-07-01 01:00:00
#5 2015-07-01 01:15:00
#6 2015-07-01 01:30:00
#7 2015-07-01 01:45:00
#8 2015-07-01 02:00:00
#9 2015-07-01 02:15:00

The idea is to let POSIXct perform the conversion to the next hour / day / month / ... triggered by a "60 minutes" value. For this we first identify those entries containing :60: and replace that part with :59:. Then the column is converted into a POSIXct object. Afterwards we find all those entries containing a ":59:" and add 60 (seconds), thereby converting the time/date to the intended format.


In the case described by the OP the data contains only quarter hour values 0, 15, 30, 40, 60. A more general situation may include genuine 59 minutes values that should not be converted to the next hour. It would then be better to store the relevant row indices before performing the conversion:

the60s <- grepl(":60:", df1$date)
df1$date <- gsub(":60:",":59:",df1$date, fixed = TRUE)    
df1$date <- as.POSIXct(df1$date)
df1$date[the60s] <- df1$date[the60s] + 60

data:

df1 <- structure(list(date = structure(1:9, .Label = c("2015-07-01 0:15:00", 
 "2015-07-01 0:30:00", "2015-07-01 0:45:00", "2015-07-01 0:60:00", 
 "2015-07-01 1:15:00", "2015-07-01 1:30:00", "2015-07-01 1:45:00", 
 "2015-07-01 1:60:00", "2015-07-01 2:15:00"), class = "factor")), 
 .Names = "date", row.names = c(NA, -9L), class = "data.frame")
like image 162
RHertel Avatar answered Oct 05 '22 08:10

RHertel