I'm trying to connect to Amazon Athena via JDBC and pool
:
What has worked so far:
library(RJDBC)
library(DBI)
library(pool)
library(dplyr)
library(dbplyr)
drv <- RJDBC::JDBC('com.amazonaws.athena.jdbc.AthenaDriver', '/opt/jdbc/AthenaJDBC41-1.1.0.jar')
pool_instance <- dbPool(
drv = drv,
url = "jdbc:awsathena://athena.us-west-2.amazonaws.com:443/",
user = "me",
s3_staging_dir = "s3://somedir",
password = "pwd"
)
mydata <- DBI::dbGetQuery(pool_instance, "SELECT *
FROM myDB.myTable
LIMIT 10")
mydata
---> Works fine. Correct data is beeing returned.
That does not work:
pool_instance %>% tbl("myDB.myTable") %>% head(10)
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
# Unable to retrieve JDBC result set for SELECT *
# FROM "myDB.myTable" AS "zzz2"
# WHERE (0 = 1) ( Table myDB.myTable not found. Please check your query.)
The problem here is that Athena expects the following syntax as SQL:
Either:
SELECT *
FROM "myDB"."myTable"
Or:
SELECT *
FROM myDB.myTable
So basically, by passing the string "myDB.myTable"
:
pool_instance %>% tbl("myDB.myTable") %>% head(10)
The following syntax is being used:
SELECT *
FROM "myDB.myTable"
which results in the following error since such table doesn't exist:
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
# Unable to retrieve JDBC result set for SELECT *
# FROM "myDB.myTable" AS "zzz6"
# WHERE (0 = 1) ( Table myDB.myTable not found. Please check your query.)
What I have tried:
So therefore I have tried to pass either "myDB"."myTable"
or myDB.myTable
to tbl()
unsuccessfully:
I have tried to use capture.output(cat('\"myDB\".\"myTable\"'))
:
pool_instance %>% tbl(capture.output(cat('\"myDB\".\"myTable\"'))) %>% head(10)
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
# Unable to retrieve JDBC result set for SELECT *
# FROM """myDB"".""myTable""" AS "zzz4"
# WHERE (0 = 1) ( Table ""myDB"".""myTable"" not found. Please check your query.)
pool_instance %>% tbl(noquote("myDB"."myTable") %>% head(10)
# Error in UseMethod("as.sql") :
# no applicable method for 'as.sql' applied to an object of class "noquote"
In Athena homepage, navigate to the menu on the left. Go to 'Data sources' tab. Click on the name of the Data source that contains the database you want to delete. Click on the name of the database you want to delete.
Open the Athena console at https://console.aws.amazon.com/athena/ . In the query editor, next to Tables and views, choose Create, and then choose AWS Glue crawler. Follow the steps on the Add crawler page of the AWS Glue console to add a crawler. For more information, see Using AWS Glue crawlers.
[RESTRICT|CASCADE] Specifying CASCADE causes the database and all its tables to be dropped.
You can use dbplyr::in_schema
:
pool_instance %>% tbl(in_schema("myDB", "myTable")) %>% head(10)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With