
Can't access string methods in dbplyr

Tags: r, dbplyr

I am trying to use the str_detect, str_replace and str_replace_all methods in dbplyr with Oracle as the backend database, but I can't seem to access these methods.

Here is the error:

db_tbl %>% mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% show_query()
Error: str_detect() is not available in this SQL variant

I have reinstalled all the packages, but that did not help. However, I can see that it was implemented in dbplyr 1.2.0 (see here).

I also tried grepl, which translates to this:

db_tbl %>% mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% show_query()
<SQL>
Named arguments ignored for SQL grepl
SELECT grepl("COMMENTS", '[^[:alnum:]]' AS "pattern") AS "COMMENTS_NEW"
FROM ("schema".table) 

This also returns an error. Here is the traceback:

20. stop(structure(list(message = "<SQL> 'SELECT * FROM (SELECT \"COMMENTS\", \"TYPE_28\", grepl(\"COMMENTS\", '[^[:alnum:]]' AS \"pattern\") AS \"COMMENTS_NEW\"\nFROM (\"schema\".table) ) \"zzz3\" WHERE ROWNUM <= 6.0'\n nanodbc/nanodbc.cpp:1587: HY000: [Oracle][ODBC][Ora]ORA-00907: missing right parenthesis\n ", call = NULL, cppstack = NULL), class = c("odbc::odbc_error", "C++Error", "error", "condition")))
19. new_result(connection@ptr, statement)
18. OdbcResult(connection = conn, statement = statement)
17. dbSendQuery(con, sql)
16. dbSendQuery(con, sql)
15. db_collect.DBIConnection(x$src$con, sql, n = n, warn_incomplete = warn_incomplete)
14. db_collect(x$src$con, sql, n = n, warn_incomplete = warn_incomplete)
13. collect.tbl_sql(x, n = n)
12. collect(x, n = n)
11. as.data.frame(collect(x, n = n))
10. as.data.frame.tbl_sql(head(x, n + 1))
9. as.data.frame(head(x, n + 1))
8. trunc_mat(x, n = n, width = width, n_extra = n_extra)
7. format.tbl(x, ..., n = n, width = width, n_extra = n_extra)
6. format(x, ..., n = n, width = width, n_extra = n_extra)
5. paste0(..., "\n")
4. cat(paste0(..., "\n"), sep = "")
3. cat_line(format(x, ..., n = n, width = width, n_extra = n_extra))
2. print.tbl_sql(x)
1. function (x, ...) UseMethod("print")(x)

Here is my session info:

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dbplot_0.3.2           pool_0.1.4.2           dbplyr_1.4.2           DBI_1.0.0              odbc_1.1.6             data.table_1.11.8     
 [7] qdap_2.3.0             RColorBrewer_1.1-2     qdapTools_1.3.3        qdapRegex_0.7.2        qdapDictionaries_1.0.7 textclean_0.9.3       
[13] drlib_0.1.0            lubridate_1.7.4        ggrepel_0.8.0          fpp2_2.3               expsmooth_2.3          fma_2.3               
[19] forecast_8.5           recipes_0.1.5          textSummary_0.1.0      scales_1.0.0           janitor_1.1.1          forcats_0.3.0         
[25] stringr_1.4.0          dplyr_0.8.1            purrr_0.2.5            readr_1.2.1            tidyr_0.8.2            tibble_2.1.1          
[31] ggplot2_3.2.0          tidyverse_1.2.1       

loaded via a namespace (and not attached):
  [1] openNLPdata_1.5.3-4 colorspace_1.4-1    class_7.3-14        rprojroot_1.3-2     fs_1.2.6            base64enc_0.1-3    
  [7] rstudioapi_0.8      remotes_2.0.2       bit64_0.9-7         prodlim_2018.04.18  fansi_0.4.0         xml2_1.2.0         
 [13] splines_3.5.0       knitr_1.20          pkgload_1.0.2       jsonlite_1.6        venneuler_1.1-0     rJava_0.9-10       
 [19] broom_0.5.1         compiler_3.5.0      httr_1.3.1          backports_1.1.2     assertthat_0.2.1    Matrix_1.2-14      
 [25] lazyeval_0.2.2      cli_1.1.0           later_0.8.0         prettyunits_1.0.2   tools_3.5.0         igraph_1.2.2       
 [31] NLP_0.2-0           gtable_0.3.0        glue_1.3.1          reshape2_1.4.3      Rcpp_1.0.1          slam_0.1-43        
 [37] cellranger_1.1.0    fracdiff_1.4-2      urca_1.3-0          gdata_2.18.0        nlme_3.1-137        lmtest_0.9-36      
 [43] timeDate_3043.102   gower_0.1.2         gender_0.5.2        ps_1.2.1            xlsxjars_0.6.1      testthat_2.0.1     
 [49] rvest_0.3.2         devtools_2.0.1      gtools_3.8.1        XML_3.98-1.16       xlsx_0.6.1          MASS_7.3-49        
 [55] zoo_1.8-5           ipred_0.9-8         hms_0.4.2           parallel_3.5.0      yaml_2.2.0          quantmod_0.4-14    
 [61] curl_3.3            memoise_1.1.0       gridExtra_2.3       rpart_4.1-13        stringi_1.4.3       desc_1.2.0         
 [67] tseries_0.10-46     plotrix_3.7-4       TTR_0.23-4          pkgbuild_1.0.2      openNLP_0.2-6       lava_1.6.4         
 [73] chron_2.3-53        rlang_0.4.0         pkgconfig_2.0.2     bitops_1.0-6        lattice_0.20-35     processx_3.2.0     
 [79] bit_1.1-14          tidyselect_0.2.5    plyr_1.8.4          magrittr_1.5        R6_2.4.0            generics_0.0.2     
 [85] pillar_1.3.1        haven_2.0.0         withr_2.1.2         xts_0.11-2          survival_2.41-3     RCurl_1.95-4.11    
 [91] nnet_7.3-12         modelr_0.1.2        crayon_1.3.4        utf8_1.1.4          wordcloud_2.6       usethis_1.4.0      
 [97] grid_3.5.0          readxl_1.1.0        callr_3.0.0         blob_1.1.1          reports_0.1.4       digest_0.6.18      
[103] tm_0.7-5            munsell_0.5.0       sessioninfo_1.1.1   quadprog_1.5-5     
Shery asked Nov 06 '22 14:11


1 Answer

This is not really an answer, just a simple workaround.

The problem is that dbplyr cannot generate a valid SQL clause (SQL has no functions named str_detect or grepl), so it throws in the towel (and an error).

In both expressions you get an error because dbplyr can translate neither stringr::str_detect() nor base::grepl() to a valid SQL expression. One way to get almost what you want is to collect() before you mutate(). The versions below, where the string function still has to be translated to SQL, all fail:

db_tbl %>% 
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% 
  show_query()
db_tbl %>% 
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% 
  collect()
db_tbl %>% 
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% 
  show_query()
db_tbl %>% 
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% 
  collect()

However, if you place collect() before mutate()...

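# Pull the remote table into R first, then apply stringr locally.
# The stray third argument "" is dropped: str_detect() only needs string and pattern.
db_tbl %>% 
  collect() %>%
  mutate(COMMENTS_NEW = str_detect(COMMENTS, "[^[:alnum:]///' ]"))
# The same works with base::grepl(); pattern is passed by name, so COMMENTS is matched to x.
db_tbl %>% 
  collect() %>%
  mutate(COMMENTS_NEW = grepl(COMMENTS, pattern = '[^[:alnum:]]'))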

...your remote table becomes a local one, to which you can apply str_detect() without complaint.
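
As a minimal sketch of what the collect()-first approach gives you, the snippet below uses a small local tibble as a hypothetical stand-in for the result of db_tbl %>% collect(). The sample comments and the column name HAS_SPECIAL are made up for illustration. It also assumes the original intent was to strip the unwanted characters, which is the job of str_replace_all(); str_detect() only reports whether a match exists.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the collected table; real data would come from
# db_tbl %>% collect().
local_tbl <- tibble(COMMENTS = c("plain text", "needs-cleaning!", "it's fine"))

# Anything that is not alphanumeric, a slash, an apostrophe or a space.
unwanted <- "[^[:alnum:]/' ]"

cleaned <- local_tbl %>%
  mutate(
    # TRUE where the comment contains at least one unwanted character
    HAS_SPECIAL  = str_detect(COMMENTS, unwanted),
    # the comment with every unwanted character removed
    COMMENTS_NEW = str_replace_all(COMMENTS, unwanted, "")
  )

print(cleaned)
```

Keep in mind that collect() transfers every row to R, so on a large table it may be worth selecting only the columns you need before collecting.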

As a side note, show_query() is no longer meaningful after collect(), because the data is then a local data frame and no SQL is generated.

Marcelo Ventura answered Nov 16 '22 23:11