I'm working on adding HiveServer2 support to my company's R data-access package. I'm curious what the best way of generating an R Thrift client would be. I'm considering writing an R wrapper around the Java Thrift client, similar to what rhbase does, but I'd prefer a pure R solution, if possible.
Things to note:
beeline
client in some R goodness.Thrift comes in the architectural part of Hive, Thrift is a protocol for the application which were developed in a different programming languages to communicate. So the Thrift server sits in a hive services layer and this is the one which receives the request or hive queries from client programs.
Thrift is an interface definition language and binary communication protocol used for defining and creating services for numerous programming languages. It was developed at Facebook for "scalable cross-language services development" and as of 2020 is an open source project in the Apache Software Foundation.
The exact scope of this question may be too broad for Stackoverflow and the asker confirmed he abandoned this quest, but for future readers this is probably the thing to look for:
From R you can connect to Hive with JDBC.
This is not exactly what the asker came for, but it should serve the purpose in most cases.
The key part in the solution for this would be the RJDBC package, here is some example code found on the Cloudera Community
library(DBI)
library(rJava)
library(RJDBC)
hadoop.class.path = list.files(path=c("/usr/hdp/2.4.0.0-169/hadoop"),pattern="jar", full.names=T);
hive.class.path = list.files(path=c("/usr/hdp/current/hive-client/lib"),pattern="jar", full.names=T);
hadoop.lib.path = list.files(path=c("/usr/hdp/current/hive-client/lib"),pattern="jar",full.names=T);
mapred.class.path = list.files(path=c("/usr/hdp/current/hadoop-mapreduce-client/lib"),pattern="jar",full.names=T);
cp = c(hive.class.path,hadoop.lib.path,mapred.class.path,hadoop.class.path)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver","hive-jdbc.jar",identifier.quote="`")
conn <- dbConnect(drv, "jdbc:hive2://ixxx:10000/default", "hive", "hive")
show_databases <- dbGetQuery(conn, "show databases")
Full disclosure: I am an employee of cloudera.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With