All of the dates that I've manipulated in the Execute R Script module in Azure Machine Learning are written out as blank in the output. That is, the date columns exist, but they contain no values.
The source variables which contain date information that I’m reading into the data frame have two different date formats. They are as follows:
usage$Date1=c('8/6/2015', '8/20/2015', '7/9/2015')
usage$Date2=c('4/16/2015 0:00', '7/1/2015 0:00', '7/1/2015 0:00')
I inspected the log file in AML, and AML can't find the local time zone. The log file shows these warnings specifically:
[ModuleOutput] 1: In strptime(x, format, tz = tz) :
[ModuleOutput] unable to identify current timezone 'C':
[ModuleOutput] please set environment variable 'TZ'
[ModuleOutput]
[ModuleOutput] 2: In strptime(x, format, tz = tz) : unknown timezone 'localtime'
I referred to another answer about setting the default time zone for strptime: "unknown timezone name in R strptime/as.POSIXct".
I changed my code to set the TZ environment variable explicitly.
Sys.setenv(TZ='GMT')
####Data frame usage cleanup, format and labeling
usage<-as.data.frame(usage)
usage$Date1<-as.character(usage$Date1)
usage$Date1<-as.POSIXct(usage$Date1, "%m/%d/%Y",tz="GMT")
usage$Date1<-format(usage$Date1, "%m/%d/%Y")
usage$Date1<-as.Date(usage$Date1, "%m/%d/%Y")
usage<-as.data.frame(usage)
usage$Date2<- as.POSIXct(usage$Date2, "%m/%d/%Y",tz="GMT")
usage$Date2<- format(usage$Date2,"%m/%d/%Y")
usage$Date2<-as.Date(usage$Date2, "%m/%d/%Y")
usage<-as.data.frame(usage)
The problem persists: AzureML still does not write these variables out, and the columns come out blank.
(This code works in RStudio, where I presume the local time is taken from the system.)
After reading two blog posts on this problem, it seems that Azure ML doesn't support some date time formats:
http://blogs.msdn.com/b/andreasderuiter/archive/2015/02/03/troubleshooting-error-1000-rpackage-library-exception-failed-to-convert-robject-to-dataset-when-running-r-scripts-in-azure-ml.aspx
http://www.mikelanzetta.com/2015/01/data-cleaning-with-azureml-and-r-dates/
So I tried to convert to POSIXct before sending it to the output stream, which I've done as follows:
tenantusage$Date1 = as.POSIXct(tenantusage$Date1 , "%m/%d/%Y",tz = "EST5EDT");
tenantusage$Date2 = as.POSIXct(tenantusage$Date2 , "%m/%d/%Y",tz = "EST5EDT");
But I encounter the same problem. The information in these variables is not written to the output: the Date1 and Date2 columns are blank.
Please advise!
thanks
Hi SingingData and SochiX,
Sorry to hear about this source of frustration! I find that the following variation on SingingData's code sample works for me (tested in a CRAN 3.1.0 module):
usage <- data.frame(list(Date1 = c('8/6/2015', '8/20/2015', '7/9/2015'),
Date2 = c('4/16/2015 0:00', '7/1/2015 0:00', '7/1/2015 0:00')))
usage$Date1 <- as.POSIXlt(usage$Date1, "%m/%d/%Y",tz="GMT")
usage$Date2 <- as.POSIXlt(usage$Date2, "%m/%d/%Y",tz="GMT")
usage$Date1 <- format(usage$Date1, "%m/%d/%Y")
usage$Date2 <- format(usage$Date2,"%m/%d/%Y")
usage$Date1 <- as.Date(usage$Date1, "%m/%d/%Y")
usage$Date2 <- as.Date(usage$Date2, "%m/%d/%Y")
maml.mapOutputPort("usage");
I've used as.POSIXlt() instead of as.POSIXct(). I hope that this helps unblock your work in R.
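For comparison, here is a minimal sketch of the same conversion that goes straight from character to Date with the format argument named explicitly, so the format string cannot be confused with the positional tz argument. The stringsAsFactors = FALSE setting and the NA check are assumptions added for illustration, and this variant has not been verified inside Azure ML. The maml.mapOutputPort call is left commented out so the snippet also runs in a plain R session.

```r
# Sample data from the question; keep the columns as character rather than factor.
usage <- data.frame(
  Date1 = c("8/6/2015", "8/20/2015", "7/9/2015"),
  Date2 = c("4/16/2015 0:00", "7/1/2015 0:00", "7/1/2015 0:00"),
  stringsAsFactors = FALSE
)

# Parse both columns with a named format. strptime ignores trailing characters,
# so the " 0:00" suffix in Date2 does not need its own format specifier.
usage$Date1 <- as.Date(usage$Date1, format = "%m/%d/%Y")
usage$Date2 <- as.Date(usage$Date2, format = "%m/%d/%Y")

# Fail early if any value did not parse (illustrative check, not in the original).
stopifnot(!any(is.na(usage$Date1)), !any(is.na(usage$Date2)))

# maml.mapOutputPort("usage")  # uncomment inside the Execute R Script module
```

If the module still writes blank columns, emitting the dates as character strings with format(usage$Date1, "%m/%d/%Y") just before the output call may be worth trying.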