I have a timestamp column, Timelocal, in my data that's formatted as follows:
2015-08-24T00:02:03.000Z
Normally, I use the following line to convert this into a date-time format I can use.
timestamp2 = "2015-08-24T00:02:03.000Z"
timestamp2_formatted = strptime(timestamp2,"%Y-%m-%dT%H:%M:%S",tz="UTC")
# also works for dataframes (my main use of it)
df$TimeNew = strptime(df$TimeLocal,"%Y-%m-%dT%H:%M:%S",tz="UTC")
This works fine on my machine. The problem is, I'm now working with a much bigger dataframe. It's on a Redshift cluster and I am accessing it using the RPostgreSQL package. I'm using dplyr to manipulate data as the documentation online indicates that it plays nicely with RPostgreSQL.
It does seem to, except for converting the date format. I'd like to convert the character format to a time format. Timelocal was read into Redshift as "varchar", so R is interpreting it as a character field.
I've tried the following:
library(dplyr)
library(RPostgreSQL)
library(lubridate)
try 1 - using easy dplyr syntax
mutate(elevate, timelocalnew = fast_strptime(timelocal, "%Y-%m-%dT%H:%M:%S",tz="UTC"))
try 2 - using dplyr syntax from another online reference code
elevate %>%
mutate(timelocalnew = timelocal %>% fast_strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>%
filter(!is.na(timelocalnew))
try 3 - using strptime instead of fast_strptime
elevate %>%
mutate(timelocalnew = timelocal %>% strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>%
filter(!is.na(timelocalnew))
I am trying to adapt code from here: http://www.markhneedham.com/blog/2014/12/08/r-dplyr-mutate-with-strptime-incompatible-sizewrong-result-size/
My attempts fail with the following error:
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: syntax error at or near "AS"
LINE 1: ...CAST(STRPTIME("timelocal", '%YSuccess2048568264T%H%M�����', 'UTC' AS "tz") A...
^
)
In addition: Warning messages:
1: In postgresqlQuickSQL(conn, statement, ...) :
Could not create executeSELECT count(*) FROM (SELECT "timelocal", "timeutc", "zipcode", "otherdata", "country", CAST(STRPTIME("timelocal", '%Y%m%dT%H%M%S', 'UTC' AS "tz") AS TEXT) AS "timelocalnew"
FROM "data") AS "master"
2: Named arguments ignored for SQL STRPTIME
It would seem that strptime is incompatible with RPostgreSQL. Is this the right interpretation? If so, does this mean there is no means of handling date formats within R if the data is on Redshift? I checked the RPostgreSQL package documentation and did not see anything related to specifying time formats.
I would appreciate any advice on getting date-time columns formatted correctly with dplyr and RPostgreSQL.
We can convert a character string to a timestamp with strptime(). The strptime() function in R parses a character representation of a date and time according to a given format template.
You can use the as.Date() function to convert character data to dates. The call is as.Date(x, "format"), where x is the character data and format gives the appropriate format.
Importing Dates from Character Format: for example, “05/27/84” is in the format %m/%d/%y, while “May 27 1984” is in the format %B %d %Y. This outputs the dates in the ISO 8601 international standard format %Y-%m-%d.
To create a Date object from a simple character string in R, you can use the as.Date() function. The character string has to obey a format that can be defined using a set of symbols (the examples correspond to 13 January, 1982): %Y : 4-digit year (1982)
Traditional R functions will not work here.
You should go with SQL translation, which has been evolving in the latest versions of dplyr and dbplyr.
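To make the translation issue concrete, here is a minimal sketch that inspects the SQL dbplyr generates without touching a real database. The name elevate_sim is a hypothetical stand-in for the remote table, and I am assuming a dbplyr version recent enough to provide lazy_frame() and simulate_redshift().

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the remote table: a lazy frame on a simulated
# Redshift connection, so no database is needed to inspect the SQL.
elevate_sim <- lazy_frame(
  timelocal = "2015-08-24T00:02:03.000Z",
  con = simulate_redshift()
)

# strptime() has no SQL translation in dbplyr, so the call is passed
# through to the database as an ordinary SQL function call.
untranslated <- elevate_sim %>%
  mutate(timelocalnew = strptime(timelocal, "%Y-%m-%dT%H:%M:%S"))

# Print the generated SQL instead of executing it.
show_query(untranslated)
```

The printed query should contain a call to strptime, which Redshift does not provide. That matches the syntax error in the question: the function is handed to the database rather than run in R.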
The following worked for me:
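library(dplyr)
library(dbplyr)
# to_date() is not an R function; dbplyr passes it to Redshift unchanged (elevate is the remote table from the question)
elevate %>% mutate(date = to_date(timestamp2, 'YYYY-MM-DD'))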
Note, I am using AWS Redshift.
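If the rows you need fit in memory, another option is to pull them into R with collect() and parse the column locally, where the usual R date functions are available. This is my own suggestion rather than something I tested on Redshift. In the sketch below, elevate_local is a hypothetical stand-in for the result of elevate %>% collect().

```r
library(dplyr)

# Hypothetical stand-in for `elevate %>% collect()`: an ordinary local
# data frame holding the character timestamps.
elevate_local <- tibble(
  timelocal = c("2015-08-24T00:02:03.000Z", "2015-08-24T13:45:10.000Z")
)

# Once the data is local, base R parsing works as usual. as.POSIXct() is
# used rather than strptime() because POSIXct is the usual class for
# data frame columns, whereas strptime() returns POSIXlt.
elevate_local <- elevate_local %>%
  mutate(timelocalnew = as.POSIXct(timelocal,
                                   format = "%Y-%m-%dT%H:%M:%S",
                                   tz = "UTC"))
```

The trade-off is that collect() transfers the data to your machine, so on a large table it is better to filter or aggregate on the Redshift side first.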