How do I write data from R to PostgreSQL tables with an autoincrementing primary key?

Tags: r, postgresql

I have a table in a PostgreSQL database that has a BIGSERIAL auto-incrementing primary key. Recreate it using:

CREATE TABLE foo
(
  "Id" bigserial PRIMARY KEY,
  "SomeData" text NOT NULL
);

I want to append some data to this table from R via the RPostgreSQL package. In R, the data doesn't include the Id column because I want the database to generate those values.

dfr <- data.frame(SomeData = letters)

Here's the code I used to try and write the data:

library(RPostgreSQL)
conn <- dbConnect(
  "PostgreSQL", 
  user     = "yourname", 
  password = "your password",
  dbname   = "test"
)
dbWriteTable(conn, "foo", dfr, append = TRUE, row.names = FALSE)
dbDisconnect(conn)

Unfortunately, dbWriteTable throws an error:

## Error in postgresqlgetResult(new.con) : 
##   RS-DBI driver: (could not Retrieve the result : ERROR:  invalid input syntax for integer: "a"
## CONTEXT:  COPY foo, line 1, column Id: "a"
## )

The error message isn't completely clear, but I interpret this as R trying to pass the contents of the SomeData column to the first column in the database (which is Id).
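
A toy sketch of that interpretation in base R, with no database involved. It assumes that a COPY without a column list fills the table's columns strictly in order, so the first field of each row lands in Id. The table_cols vector and the target name are made up for the illustration.

```r
# Columns of foo in table order, and the data frame from above
table_cols <- c("Id", "SomeData")
dfr <- data.frame(SomeData = letters)

# Assumption: without a column list, field i of each row goes to table column i
target <- table_cols[seq_along(dfr)]
names(target) <- names(dfr)

# Shows which table column each data frame column would be written to
print(target)

# The first value that would be sent to the bigint Id column
first_value <- as.character(dfr$SomeData[1])
print(first_value)
```

If that reading is right, the first field of the first row is the letter in first_value, which cannot be parsed as an integer, and that would match the "column Id" context in the error.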

How should I be passing the data to PostgreSQL so that the Id column is auto-generated?

Richie Cotton asked Oct 19 '14 10:10


1 Answer

From the thread in hrbrmstr's comment, I found a hack to make this work.

In the postgresqlWriteTable function of the RPostgreSQL package, you need to replace the line

sql4 <- paste("COPY", postgresqlTableRef(name), "FROM STDIN")

with

sql4 <- paste(
  "COPY ", 
  postgresqlTableRef(name), 
  "(", 
  paste(postgresqlQuoteId(names(value)), collapse = ","), 
  ") FROM STDIN"
)

Note that quoting the column names (not included in the original hack) is necessary to pass case-sensitive column names such as "SomeData".
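
To see why the quoting matters: PostgreSQL folds unquoted identifiers to lower case, so an unquoted SomeData would be looked up as somedata. Here is a minimal base R sketch of the kind of statement the modified line is meant to produce. quote_ident is a hypothetical stand-in for the package's quoting helpers, and the exact spacing of the real statement may differ.

```r
# Hypothetical helper: wrap an identifier in double quotes,
# doubling any embedded double quotes
quote_ident <- function(x) {
  paste0('"', gsub('"', '""', x, fixed = TRUE), '"')
}

dfr <- data.frame(SomeData = letters)

# Column list built from the data frame's names, so Id is not mentioned
# and is left for the database to fill in
column_list <- paste(quote_ident(names(dfr)), collapse = ",")
copy_sql <- paste0("COPY ", quote_ident("foo"), " (", column_list, ") FROM STDIN")

cat(copy_sql, "\n")
```

Because the column list names only SomeData, the Id column falls back to its default, which for a bigserial column is the next sequence value.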

Here's a script that makes this replacement in the current R session:

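# Get the source of postgresqlWriteTable as a character vector of code lines
body_lines <- deparse(body(RPostgreSQL::postgresqlWriteTable))
# Add a quoted column list to the COPY statement
new_body_lines <- sub(
  'postgresqlTableRef(name), "FROM STDIN")', 
  'postgresqlTableRef(name), "(", paste(postgresqlQuoteId(names(value)), collapse = ","), ") FROM STDIN")', 
  body_lines,
  fixed = TRUE
)
fn <- RPostgreSQL::postgresqlWriteTable
body(fn) <- parse(text = new_body_lines)
# search() lists attached packages as "package:<name>", so test for that form.
# Reattach with library(RPostgreSQL) after patching.
while("package:RPostgreSQL" %in% search()) detach("package:RPostgreSQL")
assignInNamespace("postgresqlWriteTable", fn, "RPostgreSQL")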
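
The body-rewriting trick is easier to follow on a toy function. The sketch below uses only base R. make_copy and its arguments are invented for the illustration and have nothing to do with RPostgreSQL's internals.

```r
# Toy function standing in for the one being patched
make_copy <- function(name, cols) {
  paste("COPY", name, "FROM STDIN")
}

# 1. Turn the body into a character vector, one element per line of code
body_lines <- deparse(body(make_copy))

# 2. Edit the text; fixed = TRUE treats the pattern as a literal string, not a regex
new_body_lines <- sub(
  'name, "FROM STDIN")',
  'name, "(", paste(cols, collapse = ","), ") FROM STDIN")',
  body_lines,
  fixed = TRUE
)

# 3. Parse the edited text back into code and install it as the new body
patched <- make_copy
body(patched) <- parse(text = new_body_lines)[[1]]

# Compare the statement built by the original and the patched function
before <- make_copy('"foo"', '"SomeData"')
after <- patched('"foo"', '"SomeData"')
print(before)
print(after)
```

If sub finds no match, it returns its input unchanged and the patch silently does nothing, so it is worth comparing the output before and after, as the last lines do.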

Richie Cotton answered Oct 17 '22 07:10