
How to pass data.frame for UPDATE with R DBI

With RODBC, there were functions like sqlUpdate(channel, dat, ...) that allowed you to pass dat = data.frame(...) instead of having to construct your own SQL string.

However, with R's DBI, all I see are functions like dbSendQuery(conn, statement, ...), which only take a string statement and give no opportunity to specify a data.frame directly.

So how do I UPDATE using a data.frame with DBI?

mchen asked Dec 12 '13 14:12


2 Answers

My answer is really late, but maybe still helpful...

There is no single function that I know of in the DBI/odbc packages, but you can replicate the update behavior with a prepared UPDATE statement. This should be faster than RODBC's sqlUpdate, since it sends the parameter values to the SQL server as a batch:

library(DBI)
library(odbc)

con <- dbConnect(odbc::odbc(), driver="{SQL Server Native Client 11.0}", server="dbserver.domain.com\\default,1234", Trusted_Connection = "yes", database = "test")  # assumes Microsoft SQL Server

dbWriteTable(con, "iris", iris, row.names = TRUE)      # create and populate a table (the row names become a separate column, row_names, used as row ID)

# dbSendStatement() is the DBI call intended for data manipulation statements such as UPDATE
update <- dbSendStatement(con, 'update iris set "Sepal.Length"=?, "Sepal.Width"=?, "Petal.Length"=?, "Petal.Width"=?, "Species"=? WHERE row_names=?')

# create a modified version of `iris`
iris2 <- iris
iris2$Sepal.Length <- 5
iris2$Petal.Width[2] <- 1
iris2$row_names <- rownames(iris)  # use the row names as unique row ID

dbBind(update, iris2)  # send the updated data

dbClearResult(update)  # release the prepared statement

# now read the modified data - you will see the updates did work
data1 <- dbReadTable(con, "iris")

dbDisconnect(con)

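This works only if the table has a primary key. In the example above I created one from the row names, which are unique numbers that increase by one for each row. Note that the columns of the data frame passed to dbBind must be in the same order as the ? placeholders in the statement.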

For more information about the odbc package I have used in the DBI dbConnect statement see: https://github.com/rstats-db/odbc
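
Typing the placeholder list by hand gets error-prone for wide tables. As a minimal sketch, the statement can be generated from the column names instead. The helper name build_update_sql is hypothetical (it is not part of DBI or odbc), and the double-quote quoting is an assumption that matches the example above; with a live connection, DBI::dbQuoteIdentifier(con, ...) would be the safer way to quote names.

```r
# Hypothetical helper (not part of DBI/odbc): build a parameterized UPDATE
# statement with one "?" placeholder per column.
build_update_sql <- function(table, set_cols, key_cols) {
  stopifnot(length(set_cols) > 0, length(key_cols) > 0)
  # Assumption: identifiers are quoted ANSI-style with double quotes
  quote_id <- function(x) paste0('"', gsub('"', '""', x, fixed = TRUE), '"')
  set_part   <- paste0(quote_id(set_cols), "=?", collapse = ", ")
  where_part <- paste0(quote_id(key_cols), "=?", collapse = " AND ")
  paste0("UPDATE ", quote_id(table), " SET ", set_part, " WHERE ", where_part)
}

set_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species")
key_cols <- "row_names"
sql <- build_update_sql("iris", set_cols, key_cols)

# Reorder the data frame so its columns line up with the placeholders:
# first the SET columns, then the key columns.
iris2 <- iris
iris2$row_names <- rownames(iris)
params <- iris2[, c(set_cols, key_cols)]
```

The resulting sql string and params data frame would then take the place of the hand-written statement and of iris2 in the dbBind call above.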

R Yoda answered Nov 03 '22 09:11


Building on R Yoda's answer, I made myself the helper function below. This allows using a dataframe to specify update conditions.

I built this to run transactional updates (i.e. one row at a time). In theory it can update multiple rows in the table if the key condition matches more than one of them, but that's not the same as updating multiple rows from a multi-row input dataframe. Maybe somebody else can build on this...


dbUpdateCustom = function(x, key_cols, con, schema_name, table_name) {
  
  if (nrow(x) != 1) stop("Input dataframe must be exactly 1 row")
  if (!all(key_cols %in% colnames(x))) stop("All columns specified in 'key_cols' must be present in 'x'")
  
  # Build the update string --------------------------------------------------

  # all_of() replaces the superseded one_of() and errors on missing columns
  df_key     <- dplyr::select(x,  dplyr::all_of(key_cols))
  df_upt     <- dplyr::select(x, -dplyr::all_of(key_cols))
  
  set_str    <- purrr::map_chr(colnames(df_upt), ~glue::glue_sql('{`.x`} = {x[[.x]]}', .con = con))
  set_str    <- paste(set_str, collapse = ", ")
  
  where_str  <- purrr::map_chr(colnames(df_key), ~glue::glue_sql("{`.x`} = {x[[.x]]}", .con = con))
  where_str  <- paste(where_str, collapse = " AND ")
  
  update_str <- glue::glue('UPDATE {schema_name}.{table_name} SET {set_str} WHERE {where_str}')
  
  # Execute ------------------------------------------------------------------
  
  # dbExecute() sends the statement and clears the result in one call
  DBI::dbExecute(con, update_str)

  return (invisible(TRUE))
}

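Where

  • x: 1-row dataframe that contains 1+ key columns, and 1+ update columns.
  • key_cols: character vector of 1 or more column names that are the keys (i.e. used in the WHERE clause)
  • con: the DBI connection the statement is run on
  • schema_name, table_name: strings naming the target table; they are inserted into the statement as-is, without quoting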
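
One simple way to build on this for a multi-row dataframe is to call the one-row updater once per row. The sketch below is only an illustration: dbUpdateRows and record_only are hypothetical names, and the dry run uses a stand-in updater so it can run without a database. It issues one statement per row, so it will be slower than a single batched dbBind.

```r
# Hypothetical wrapper: apply a one-row updater (such as dbUpdateCustom)
# to every row of `x`. Extra arguments in `...` are passed through.
dbUpdateRows <- function(x, key_cols, update_one, ...) {
  stopifnot(is.data.frame(x), all(key_cols %in% colnames(x)))
  for (i in seq_len(nrow(x))) {
    # drop = FALSE keeps the row as a 1-row data frame
    update_one(x[i, , drop = FALSE], key_cols, ...)
  }
  invisible(nrow(x))
}

# Dry run with a stand-in updater that only records the key it receives
seen <- new.env()
seen$ids <- character(0)
record_only <- function(x, key_cols, ...) {
  seen$ids <- c(seen$ids, as.character(x[[key_cols[1]]]))
  invisible(TRUE)
}

df <- data.frame(id = c("a", "b", "c"), value = c(1, 2, 3), stringsAsFactors = FALSE)
n <- dbUpdateRows(df, "id", record_only)
```

With a real connection the call would look like dbUpdateRows(df, "id", dbUpdateCustom, con = con, schema_name = "dbo", table_name = "iris"), where "dbo" is just a placeholder schema. Wrapping that call in DBI::dbWithTransaction(con, ...) should make the row updates succeed or fail together, assuming the backend supports transactions.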

mkirzon answered Nov 03 '22 09:11