How to escape Athena database.table using pool package?

Tags: database, r, dbplyr

I'm trying to connect to Amazon Athena via JDBC and pool:

What has worked so far:

library(RJDBC)
library(DBI)
library(pool)
library(dplyr)
library(dbplyr)

drv <- RJDBC::JDBC('com.amazonaws.athena.jdbc.AthenaDriver', '/opt/jdbc/AthenaJDBC41-1.1.0.jar')

pool_instance <- dbPool(
  drv = drv,
  url = "jdbc:awsathena://athena.us-west-2.amazonaws.com:443/",
  user = "me",
  s3_staging_dir = "s3://somedir",
  password = "pwd"
)

mydata <- DBI::dbGetQuery(pool_instance, "SELECT * 
                                          FROM myDB.myTable
                                          LIMIT 10")

mydata

This works fine: the correct data is returned.

What does not work:

pool_instance %>% tbl("myDB.myTable") %>% head(10)
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ",  : 
#   Unable to retrieve JDBC result set for SELECT *
# FROM "myDB.myTable" AS "zzz2"
# WHERE (0 = 1) ( Table myDB.myTable not found. Please check your query.)

The problem here is that Athena expects the following syntax as SQL:

Either:

SELECT *
FROM "myDB"."myTable"

Or:

SELECT *
FROM myDB.myTable

So basically, by passing the string "myDB.myTable":

pool_instance %>% tbl("myDB.myTable") %>% head(10)

the following SQL is generated:

SELECT *
FROM "myDB.myTable"

which results in the following error, because no table with that literal name exists:

# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ",  : 
#  Unable to retrieve JDBC result set for SELECT *
# FROM "myDB.myTable" AS "zzz6"
# WHERE (0 = 1) ( Table myDB.myTable not found. Please check your query.)
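
The quoting behaviour described above can be reproduced without an Athena connection. This is a minimal sketch that assumes DBI's generic ANSI() connection is a fair stand-in for the identifier quoting applied over JDBC; the variable names are illustrative only.

```r
library(DBI)

# ANSI() is a dummy connection that follows standard SQL quoting rules,
# so no Athena connection is needed for this illustration (assumption:
# the JDBC backend quotes identifiers the same way).
con <- DBI::ANSI()

# A single string is treated as ONE identifier and quoted as a whole.
one_name <- DBI::dbQuoteIdentifier(con, "myDB.myTable")

# Id() marks schema and table as separate components,
# so each part is quoted on its own and the parts are joined with a dot.
two_parts <- DBI::dbQuoteIdentifier(
  con,
  DBI::Id(schema = "myDB", table = "myTable")
)

print(one_name)
print(two_parts)
```

The second form matches the syntax Athena accepts, which suggests the table reference has to reach the SQL generator as two components rather than as one string.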

What I have tried:

I have therefore tried to pass either "myDB"."myTable" or myDB.myTable to tbl(), without success.

I have tried to use capture.output(cat('\"myDB\".\"myTable\"')):

pool_instance %>% tbl(capture.output(cat('\"myDB\".\"myTable\"'))) %>% head(10)
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ",  : 
#   Unable to retrieve JDBC result set for SELECT *
# FROM """myDB"".""myTable""" AS "zzz4"
# WHERE (0 = 1) ( Table ""myDB"".""myTable"" not found. Please check your query.)

pool_instance %>% tbl(noquote('"myDB"."myTable"')) %>% head(10)
# Error in UseMethod("as.sql") : 
#   no applicable method for 'as.sql' applied to an object of class "noquote"

Googme asked Jun 18 '18 19:06



1 Answer

You can use dbplyr::in_schema:

pool_instance %>% tbl(in_schema("myDB", "myTable")) %>% head(10)
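
A rough way to try this pattern locally, assuming the RSQLite package is installed: in the sketch below an in-memory SQLite connection with an attached database named myDB stands in for the Athena pool. SQLite is only a substitute, and the table contents are made up for illustration.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Stand-in for the Athena pool: an in-memory SQLite connection (assumption).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Attach a second in-memory database so that "myDB" behaves like a schema.
dbExecute(con, "ATTACH DATABASE ':memory:' AS myDB")
dbExecute(con, "CREATE TABLE myDB.myTable (x INTEGER)")
dbExecute(con, "INSERT INTO myDB.myTable VALUES (1), (2), (3)")

# in_schema() keeps schema and table separate, so each part is quoted on its own.
res <- con %>%
  tbl(in_schema("myDB", "myTable")) %>%
  head(10) %>%
  collect()

print(res)
dbDisconnect(con)
```

If the generated SQL needs checking, show_query() can be placed in the pipeline before collect() to print the statement dbplyr builds.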

Scarabee answered Oct 15 '22 09:10