How to pass data.frame for UPDATE with R DBI

Question

With RODBC, there were functions like sqlUpdate(channel, dat, ...) that allowed you pass dat = data.frame(...) instead of having to construct your own SQL string.

However, with R's DBI, all I see are functions like dbSendQuery(conn, statement, ...) which only take a string statement and gives no opportunity to specify a data.frame directly.

So how to UPDATE using a data.frame with DBI?

R Yoda · Accepted Answer

Really late, my answer, but maybe still helpful...

There is no single function (I know) in the DBI/odbc package but you can replicate the update behavior using a prepared update statement (which should work faster than RODBC's sqlUpdate since it sends the parameter values as a batch to the SQL server:

library(DBI)
library(odbc)

con <- dbConnect(odbc::odbc(), driver="{SQL Server Native Client 11.0}", server="dbserver.domain.com\default,1234", Trusted_Connection = "yes", database = "test")  # assumes Microsoft SQL Server

dbWriteTable(con, "iris", iris, row.names = TRUE)      # create and populate a table (adding the row names as a separate columns used as row ID)

update <- dbSendQuery(con, 'update iris set "Sepal.Length"=?, "Sepal.Width"=?, "Petal.Length"=?, "Petal.Width"=?, "Species"=? WHERE row_names=?')

# create a modified version of `iris`
iris2 <- iris
iris2$Sepal.Length <- 5
iris2$Petal.Width[2] <- 1
iris2$row_names <- rownames(iris)  # use the row names as unique row ID

dbBind(update, iris2)  # send the updated data

dbClearResult(update)  # release the prepared statement

# now read the modified data - you will see the updates did work
data1 <- dbReadTable(con, "iris")

dbDisconnect(con)

This works only if you have a primary key which I created in the above example by using the row names which are a unique number increased by one for each row...

For more information about the odbc package I have used in the DBI dbConnect statement see: https://github.com/rstats-db/odbc

mkirzon · Answer

Building on R Yoda's answer, I made myself the helper function below. This allows using a dataframe to specify update conditions.

While I built this to run transaction updates (i.e. single rows), it can in theory update multiple rows passing a condition. However, that's not the same as updating multiple rows using an input dataframe. Maybe somebody else can build on this...


dbUpdateCustom = function(x, key_cols, con, schema_name, table_name) {
  
  if (nrow(x) != 1) stop("Input dataframe must be exactly 1 row")
  if (!all(key_cols %in% colnames(x))) stop("All columns specified in 'key_cols' must be present in 'x'")
  
  # Build the update string --------------------------------------------------

  df_key     <- dplyr::select(x,  one_of(key_cols))
  df_upt     <- dplyr::select(x, -one_of(key_cols))
  
  set_str    <- purrr::map_chr(colnames(df_upt), ~glue::glue_sql('{`.x`} = {x[[.x]]}', .con = con))
  set_str    <- paste(set_str, collapse = ", ")
  
  where_str  <- purrr::map_chr(colnames(df_key), ~glue::glue_sql("{`.x`} = {x[[.x]]}", .con = con))
  where_str  <- paste(where_str, collapse = " AND ")
  
  update_str <- glue::glue('UPDATE {schema_name}.{table_name} SET {set_str} WHERE {where_str}')
  
  # Execute ------------------------------------------------------------------
  
  query_res <- DBI::dbSendQuery(con, update_str)
  DBI::dbClearResult(query_res)

  return (invisible(TRUE))
}

Where

x: 1-row dataframe that contains 1+ key columns, and 1+ update columns.
key_cols: character vector, of 1 or more column names that are the keys (i.e. used in the WHERE clause)

How to pass data.frame for UPDATE with R DBI

Tags:

database

dataframe

r

rodbc

r-dbi

mchen

2 Answers

R Yoda

mkirzon

Recent Activity

Donate For Us

How to pass data.frame for UPDATE with R DBI

Tags:

database

dataframe

r

rodbc

r-dbi

mchen

2 Answers

R Yoda

mkirzon

Related questions

Recent Activity

Donate For Us