 

How to convert a character date time to be useable using dplyr and RPostgreSQL?

I have a timestamp column, Timelocal, in my data that's formatted as follows:

2015-08-24T00:02:03.000Z

Normally, I use the following line to convert this format to a date format I can use.

timestamp2 = "2015-08-24T00:02:03.000Z"
timestamp2_formatted = strptime(timestamp2, "%Y-%m-%dT%H:%M:%S", tz = "UTC")
# also works for data frames (my main use of it)
df$TimeNew = strptime(df$TimeLocal, "%Y-%m-%dT%H:%M:%S", tz = "UTC")
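
One detail worth noting about the local conversion: strptime returns a POSIXlt object, which is list-based, and as.POSIXct is often suggested instead for data frame columns because it stores one number per row. The sketch below shows the same parse with as.POSIXct; the two-row `df` is made up for illustration and is not the real data.

```r
# Parse one ISO 8601 string. Characters after the seconds (".000Z") are
# ignored because the format string ends at %S.
timestamp2 <- "2015-08-24T00:02:03.000Z"
parsed <- as.POSIXct(timestamp2, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# The call is vectorised, so it also works on a data frame column.
# `df` is a small made-up stand-in for the real data.
df <- data.frame(
  TimeLocal = c("2015-08-24T00:02:03.000Z", "2015-08-25T13:45:10.000Z"),
  stringsAsFactors = FALSE
)
df$TimeNew <- as.POSIXct(df$TimeLocal, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
```

This only applies to data already in R memory; it does not change how dplyr behaves against a remote table.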

This works fine on my machine. The problem is, I'm now working with a much bigger dataframe. It's on a Redshift cluster and I am accessing it using the RPostgreSQL package. I'm using dplyr to manipulate data as the documentation online indicates that it plays nicely with RPostgreSQL.

It does seem to, except for converting the date format. I'd like to convert the character column to a date-time type. Timelocal was read into Redshift as "varchar", so R interprets it as a character field.

I've tried the following:

library(dplyr)
library(RPostgreSQL)
library(lubridate)

try 1 - using easy dplyr syntax

mutate(elevate, timelocalnew = fast_strptime(timelocal, "%Y-%m-%dT%H:%M:%S",tz="UTC")) 

try 2 - using dplyr syntax from another online reference code

elevate %>% 
  mutate(timelocalnew = timelocal %>% fast_strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>%
  filter(!is.na(timelocalnew))

try 3 - using strptime instead of fast_strptime

elevate %>% 
  mutate(timelocalnew = timelocal %>% strptime("%Y-%m-%dT%H:%M:%S",tz="UTC") %>% as.character()) %>%
  filter(!is.na(timelocalnew))

I am trying to adapt code from here: http://www.markhneedham.com/blog/2014/12/08/r-dplyr-mutate-with-strptime-incompatible-sizewrong-result-size/

My attempts fail with the following error:

Error in postgresqlExecStatement(conn, statement, ...) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  syntax error at or near "AS"
LINE 1: ...CAST(STRPTIME("timelocal", '%YSuccess2048568264T%H%M�����', 'UTC' AS "tz") A...
                                                             ^
)
In addition: Warning messages:
1: In postgresqlQuickSQL(conn, statement, ...) :
  Could not create executeSELECT count(*) FROM (SELECT "timelocal", "timeutc", "zipcode", "otherdata", "country", CAST(STRPTIME("timelocal", '%Y%m%dT%H%M%S', 'UTC' AS "tz") AS TEXT) AS "timelocalnew"
FROM "data") AS "master"
2: Named arguments ignored for SQL STRPTIME 

It would seem that strptime is incompatible with RPostgreSQL. Is this the right interpretation? If so, does this mean there is no means of handling date formats within R if the data is on Redshift? I checked the RPostgreSQL package documentation and did not see anything related to specifying time formats.

I would appreciate any advice on getting date-time columns formatted correctly with dplyr and RPostgreSQL.

leaRningR909 Avatar asked Jan 25 '16 17:01





1 Answer

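Traditional R functions such as strptime will not work here, because dplyr translates the mutate() call into SQL and runs it on the database rather than in R.
You should rely on SQL translation instead, which has been evolving in the latest versions of dplyr and dbplyr.
The following worked for me: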

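library(dplyr)
library(dbplyr)
# to_date is not an R function; it is passed through to the database as SQL
elevate %>% mutate(date = to_date(timelocal, 'YYYY-MM-DD'))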

Note, I am using AWS Redshift.
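
The to_date call above yields a date, so the time of day is dropped. If the time is needed as well, one possible fallback (an assumption on my part, not something tested against Redshift here) is to reduce the data in the database first, pull the result into R with collect(), and parse it locally. In the sketch below, `parse_iso_utc` is a hypothetical helper name and `local_rows` is a made-up stand-in for the collected result.

```r
# Hypothetical helper: parse ISO 8601 strings such as
# "2015-08-24T00:02:03.000Z" into POSIXct values in UTC.
parse_iso_utc <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
}

# With a live connection the local copy would come from something like
#   local_rows <- elevate %>% select(timelocal) %>% collect()
# Here a small made-up data frame stands in for that result.
local_rows <- data.frame(
  timelocal = c("2015-08-24T00:02:03.000Z", "not a timestamp"),
  stringsAsFactors = FALSE
)

local_rows$timelocalnew <- parse_iso_utc(local_rows$timelocal)

# Values that fail to parse come back as NA and can be dropped.
local_rows <- local_rows[!is.na(local_rows$timelocalnew), , drop = FALSE]
```

This only makes sense when the filtered or aggregated result fits in memory; for the full table, the conversion has to stay in SQL.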

elmaroto10 Avatar answered Oct 16 '22 06:10
