I'm trying to pull data from a PostgreSQL database and the results for a timestamp field are inconsistent. I'm not sure if I'm handling POSIXct results properly. Otherwise, I think I found a bug in the RPostgreSQL package. Here is the way to replicate the issue:
Suppose there is a table in a postgres database with one field (run this in PostgreSQL):
CREATE DATABASE mydb;
CREATE TABLE test_table
(
"DateTime" timestamp without time zone NOT NULL,
CONSTRAINT "pk_test_table" PRIMARY KEY ("DateTime")
)
WITH (
OIDS=FALSE
);
ALTER TABLE test_table
OWNER TO postgres;
And let’s say there are a few hundred records. I will populate them in R. Here is the code:
library(RPostgreSQL)
library(chron)  # provides as.chron() and the date-part helpers used below
# Let's feed the table with some sequence of date/time values
date_values <- as.chron(seq(10000, 10500, 1/24))
# Format chron values as 'YYYY-MM-DD HH:MM:00' strings
format.chron <- function(z) {
  sprintf("%04.0f-%02.0f-%02.0f %02.0f:%02.0f:00",
          as.numeric(as.character(years(z))),
          months(z),
          as.numeric(as.character(days(z))),
          as.numeric(as.character(hours(z))),
          as.numeric(as.character(minutes(z))))
}
.generateInsertQuery <- function(date_values, field_name, table_name) {
  # Use plain ASCII quotes: sQuote()/dQuote() can emit typographic quotes,
  # depending on the locale and options("useFancyQuotes"), which breaks the SQL
  insert_val <- paste(paste0("('", format(date_values), "')"), collapse = ',')
  qry <- paste("INSERT INTO", paste0('"', table_name, '"'),
               paste0('("', field_name, '")'), "VALUES", insert_val)
  qry
}
drv <- dbDriver('PostgreSQL')
con <- dbConnect(drv, user='postgres', dbname='mydb')
qry <- .generateInsertQuery(date_values, "DateTime", "test_table")
dbSendQuery(con, qry)
If I try to get the values, the time component gets stripped out of the resulting data:
res <- dbGetQuery(con, "SELECT * FROM test_table")
res[1:20,1]
The class of the result, however, is POSIXct:
class(res[,1])
If the result is fetched one record at a time, the values with hour:min equal to 00:00 lose the time component:
rs <- dbSendQuery(con, "SELECT \"DateTime\" FROM test_table")
res_list <- list()
for(i in 1:100) res_list[i] <- fetch(rs,1)
res_list
As a workaround, I'm fetching the result 1 record at a time, fixing, and aggregating them into a data.frame. But this is very time-consuming, especially for large data sets. Any ideas of why this is happening and how to deal with this issue?
Thanks in advance!
The TIMESTAMP (also known as TIMESTAMP WITHOUT TIME ZONE) and TIMESTAMPTZ (also known as TIMESTAMP WITH TIME ZONE) types are stored as a 64-bit integer microsecond offset since 1970-01-01 in CockroachDB (CRDB) and as a 64-bit integer microsecond offset since 2000-01-01 in PostgreSQL (by default).
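To make the two epochs concrete, here is a minimal base R sketch that converts a microsecond offset counted from 2000-01-01 into a POSIXct value, and shifts a Unix-epoch offset onto the 2000-01-01 scale. The helper names pg_us_to_posixct and unix_us_to_pg_us are made up for illustration, and the sketch assumes both epochs are midnight UTC.

```r
# Microseconds between 1970-01-01 and 2000-01-01 (both taken as midnight UTC)
epoch_shift_us <- as.numeric(as.POSIXct("2000-01-01", tz = "UTC")) * 1e6

# Hypothetical helper: microseconds since 2000-01-01 -> POSIXct in UTC
pg_us_to_posixct <- function(us) {
  as.POSIXct(us / 1e6, origin = "2000-01-01", tz = "UTC")
}

# Hypothetical helper: microseconds since 1970-01-01 -> microseconds since 2000-01-01
unix_us_to_pg_us <- function(us) {
  us - epoch_shift_us
}

# One hour after the 2000-01-01 epoch
x <- pg_us_to_posixct(3600 * 1e6)
print(format(x, "%Y-%m-%d %H:%M:%S"))
```

None of this is needed when going through RPostgreSQL, which hands back POSIXct directly; it only illustrates what the stored integer means.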
PostgreSQL supports the full set of SQL date and time types, shown in Table 8.9.
The TO_DATE function in PostgreSQL converts strings into dates. Its syntax is TO_DATE(text, text) and the return type is date. The TO_TIMESTAMP function converts string data into timestamps with time zone. Its syntax is TO_TIMESTAMP(text, text).
You can use a double colon (::) to cast a timestamp (datetime) value to a date value. You can combine TO_CHAR() with the NOW() function to convert the current timestamp into a specified format.
RPostgreSQL's dbWriteTable with any POSIXct field will create a database field of type timestamp with time zone, always with tz +00, no matter what the POSIXct time zone is. I believe it would be more precise to create timestamp without time zone instead.
The best solution for both dbReadTable and dbWriteTable is to use Sys.setenv(TZ = "UTC"). In my opinion that is too deep a dependency, because many other processes in the R session may require a proper time zone setting.
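One way to limit how far the TZ change reaches is to set it only for the duration of a single call and restore the previous value afterwards. The sketch below uses base R only; with_utc is a hypothetical helper name, not part of DBI or RPostgreSQL.

```r
# Hypothetical helper: evaluate `expr` with TZ set to "UTC", then restore
# whatever TZ was before (including the case where it was not set at all)
with_utc <- function(expr) {
  old_tz <- Sys.getenv("TZ", unset = NA)
  on.exit(
    if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz),
    add = TRUE
  )
  Sys.setenv(TZ = "UTC")
  # `expr` is a lazily evaluated argument, so it only runs here, under UTC
  force(expr)
}

# The session's TZ is untouched once the call returns
tz_before <- Sys.getenv("TZ", unset = NA)
tz_inside <- with_utc(Sys.getenv("TZ"))
tz_after <- Sys.getenv("TZ", unset = NA)
print(tz_inside)
```

With a live connection this would wrap the database call, for example with_utc(dbReadTable(con, "test_table")). Other code in the same session still sees UTC while the wrapped call is running, so this narrows the dependency rather than removing it.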
A much more specific and less invasive approach is to define your own dbReadTable and dbWriteTable that wrap the DBI versions with appropriate pre- and post-processing of POSIXct columns. But that is still not an option if you are developing DBI-compliant code or a package (not only Postgres related).
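A rough sketch of what such pre- and post-processing could look like, using base R only. Both helper names are hypothetical, and the choice to pass timestamps as UTC text is an assumption made for illustration; the actual dbWriteTable and dbReadTable calls would sit between the two steps.

```r
# Hypothetical pre-processing step: render every POSIXct column as UTC text,
# so the value written no longer depends on the session time zone
posixct_to_utc_text <- function(df) {
  is_time <- vapply(df, inherits, logical(1), what = "POSIXct")
  if (any(is_time)) {
    df[is_time] <- lapply(df[is_time], format,
                          format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  }
  df
}

# Hypothetical post-processing step: parse the named text columns back to POSIXct in UTC
utc_text_to_posixct <- function(df, cols) {
  df[cols] <- lapply(df[cols], as.POSIXct,
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  df
}

# Round trip on a small data frame whose times are given in a non-UTC zone
df <- data.frame(id = 1:2)
df$t <- as.POSIXct(c("2020-01-01 09:00:00", "2020-01-01 10:30:00"),
                   tz = "Asia/Tokyo")
out <- posixct_to_utc_text(df)
back <- utc_text_to_posixct(out, "t")
print(out$t)
```

The explicit format string matters here: it keeps the 00:00:00 part even when a value falls exactly on midnight.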
It would be great to have RPostgreSQL migrated to github for easier contribution.
First off, the RPostgreSQL project has a mailing list; I suggest you post there.
PostgreSQL has two datetime types: with and without timezone. As I recall, R only maps the latter. I did write some early regression tests for this (see the package source) but have not been that involved with the project of late. But I do recall that POSIXct maps back and forth to the PostgreSQL datetime type just fine.
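When checking whether a value really came back without its time, it may help to separate what is stored from what is printed. R's default format for POSIXct omits the time when every element in the vector is exactly midnight, so a single midnight value prints as a bare date even though nothing was lost. This may be part of what the question sees when fetching one record at a time, though that is a guess and not something confirmed in this thread. A small sketch in base R:

```r
midnight <- as.POSIXct("2013-01-01 00:00:00", tz = "UTC")
mixed <- as.POSIXct(c("2013-01-01 00:00:00", "2013-01-01 01:00:00"), tz = "UTC")

# Default formatting: a vector holding only midnight values is shown as a date
default_single <- format(midnight)

# With a non-midnight element present, times are shown
default_mixed <- format(mixed)

# An explicit format string always keeps the time component
explicit_single <- format(midnight, "%Y-%m-%d %H:%M:%S")

# The underlying number of seconds is the same for the midnight value either way
same_value <- as.numeric(midnight) == as.numeric(mixed[1])

print(default_single)
print(explicit_single)
```

Comparing as.numeric() values, or formatting with an explicit format string, shows whether the time component is actually missing or only hidden by the default print method.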