Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to collect data from future dplyr query after one fails?

Often when I try to collect a lot of data from a SQL Server database, I get a cryptic error message from the nanodbc driver. For example,

library(dplyr)
res <- collect(tbl(con, "result"))

Error in result_fetch(res@ptr, n, ...) : nanodbc/nanodbc.cpp:2525: 08S01: [Microsoft][ODBC Driver 11 for SQL Server]SSL Provider: [err

Result already cleared

From then on, I get the error

Error: 'SELECT * FROM "result" AS "zzz13" WHERE (0 = 1)' nanodbc/nanodbc.cpp:1587: 08S01: [Microsoft][ODBC Driver 11 for SQ

whenever I try to collect data from a table in the database. This continues until I restart my R session.

Has anyone seen this behavior before or could provide a way to fix it?

like image 942
wdkrnls Avatar asked Oct 21 '25 17:10

wdkrnls


2 Answers

One case in which this error definitely occurs : when there is a varchar type column in the table. you can check this simply by clicking on the connections pane in rstudio, going to the relevant table and checking the column types.

if the troublesome column is varchar_col1, then,

sometbl <- tbl(con, in_schema("schema_name","table_with_trouble"))
sometbl %>% head()

should reproduce your error.

I am not sure why this happens, but here is a workaround :

sometbl <- tbl(con, in_schema("schema_name","table_with_trouble")) %>%
    mutate(alt_col1 = as.character(varchar_col1)) %>%
    select(-varchar_col1)
sometbl %>% head()

should work.

like image 84
Prasanna Avatar answered Oct 24 '25 07:10

Prasanna


This issue still persists.

The function below allows a tbl object to be created for a sql table with a large varchar column. The object is created with the correctly ordered columns included. The tbl object can then be operated on as any other tbl.

library(odbc)
library(tidyverse)
library(dbplyr)

con <- dbConnect(
    odbc(),
    Driver = "SQL Server",
    Server = "SERVERNAME",
    Trusted_Connection = "True" # or auth spec
)

tbl_long_cols <- function(con, table){
       cols_sorted <- odbc::dbColumnInfo(
                odbc::dbSendQuery(con, paste0("SELECT * FROM ", table)
                )) |>
                arrange(desc(type)) |>
                pull(name)
  tbl(con, with_no_lock(table)) |>
  select(tidyselect::all_of(cols_sorted))
}


sql_table_obj <- tbl_long_cols(con, "server.tablename")

sql_table_obj |>
   filter(x > 12) ...


Relevant dbplyr Github issue

like image 44
hisspott Avatar answered Oct 24 '25 05:10

hisspott