I'm on a Mac (OS X) and I'd like to run queries against a Hadoop database on a CentOS 6.6 machine. I can log in to the CentOS machine and run Hive queries there, but I need to be able to run queries from my own machine to troubleshoot connection issues.
Is there a way to install Beeline (the newer version of Hive CLI) or Hive on OSX without installing/configuring Hadoop? The information that I've seen says that you need to install Hadoop first, which seems like overkill just to test whether a remote database is listening for connections.
The primary difference between the two involves how the clients connect to Hive. The Hive CLI connects directly to HDFS and the Hive Metastore, and can be used only on a host with access to those services. Beeline connects to HiveServer2 and requires access to only one .
Configuring the Connection
Specify your Beeline CLI password. Specify the JDBC Hive host that is used for Hive Beeline. Specify the JDBC Hive port that is used for Hive Beeline. Specify the JDBC Hive database that you want to connect to with Beeline, or specify a schema for an HQL statement to run with the Hive CLI.
$ brew install hive
worked well enough. I guess I'll leave this question up since I couldn't find the answer on the internet. 141 megs of disk space though, boo.
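For anyone taking the Homebrew route, a connection test from the Mac might look like the sketch below. The helper names `hs2_url` and `hs2_ping` are hypothetical, and the host, port, database, and user passed to them are placeholders rather than values from the question (10000 is HiveServer2's usual default port, but your server may differ).

```shell
#!/bin/sh
# Build a HiveServer2 JDBC URL from host, port, and database.
# hs2_url is a hypothetical helper; every value passed to it is a placeholder.
hs2_url() {
  printf 'jdbc:hive2://%s:%s/%s\n' "$1" "$2" "$3"
}

# Run a single statement through Beeline and exit.
# -u, -n, and -e are Beeline's URL, user name, and query options.
hs2_ping() {
  beeline -u "$(hs2_url "$1" "$2" "$3")" -n "$4" -e 'SHOW DATABASES;'
}
```

Something like `hs2_ping centos-host.example.com 10000 default myuser` would then either list the databases or show the connection error you are trying to troubleshoot.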
It is not necessary to install beeline/hive. All you have to do is collect the relevant jars from a system that already has Beeline and place them in a single folder.
Suppose we have a source system where Beeline is installed and a target system where we want to run Beeline.
On the source system, collect the relevant jars into one folder. The best way I have found to identify the exact jars involved is to use the JVM option -verbose:class. That is, you should be able to issue a java command that replicates a typical beeline invocation on the source system.
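As a rough sketch of that step, the function below reduces saved -verbose:class output to a unique list of jar paths. It assumes Java 8 style lines of the form `[Loaded <class> from <jar>]`; newer JVMs print a different format, and `jars_from_verbose` is a hypothetical name.

```shell
#!/bin/sh
# Reduce -verbose:class output (read from stdin) to a unique list of jar paths.
# Assumes Java 8 style lines such as:
#   [Loaded org.example.Foo from file:/opt/lib/foo.jar]
jars_from_verbose() {
  grep -o 'from .*\.jar' |                  # keep only "from <path>.jar"
    sed -e 's/^from //' -e 's/^file://' |   # strip the "from " and "file:" prefixes
    sort -u                                 # one line per distinct jar
}
```

If the verbose output was saved to a file, `jars_from_verbose < verbose.log > jars.txt` would leave one jar path per line (both file names are placeholders).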
Then copy those files into one folder on the target system. cd to that folder to make the -classpath reference simple later.
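One way to gather the files before transferring them is sketched here. It assumes you have a text file with one jar path per line; `collect_jars` and the file and folder names in the usage note are hypothetical.

```shell
#!/bin/sh
# Copy every jar listed in a file (one path per line) into a single folder.
# collect_jars is a hypothetical helper, not part of any Hive distribution.
collect_jars() {
  list_file=$1
  dest_dir=$2
  mkdir -p "$dest_dir" || return 1
  while IFS= read -r jar; do
    if [ -f "$jar" ]; then
      cp "$jar" "$dest_dir"/
    else
      echo "missing: $jar" >&2   # report, but keep going
    fi
  done < "$list_file"
  return 0
}
```

After `collect_jars jars.txt beeline-jars` on the source system, the whole folder can be moved to the target with `scp -r` or whatever transfer tool you normally use.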
I use an HDP 2.5 Hortonworks distro. For me, the following invocation on the target system works:
java -Xmx1024m \
  -classpath apache-log4j-extras-1.2.17.jar:avatica-1.8.0.2.5.0.0-1245.jar:calcite-core-1.2.0.2.5.0.0-1245.jar:calcite-linq4j-1.2.0.2.5.0.0-1245.jar:commons-cli-1.2.jar:commons-codec-1.4.jar:commons-collections-3.2.2.jar:commons-configuration-1.6.jar:commons-lang-2.6.jar:commons-logging-1.1.3.jar:curator-client-2.6.0.jar:curator-framework-2.6.0.jar:derby-10.10.2.0.jar:guava-14.0.1.jar:hadoop-annotations-2.7.3.2.5.0.0-1245.jar:hadoop-auth-2.7.3.2.5.0.0-1245.jar:hadoop-common-2.7.3.2.5.0.0-1245.jar:hadoop-mapreduce-client-core-2.7.3.2.5.0.0-1245.jar:hive-beeline-1.2.1000.2.5.0.0-1245.jar:hive-exec-1.2.1000.2.5.0.0-1245.jar:hive-jdbc-1.2.1000.2.5.0.0-1245.jar:hive-jdbc-1.2.1000.2.5.0.0-1245-standalone.jar:jce.jar:jline-2.12.jar:jsse.jar:log4j-1.2.16.jar:rt.jar:slf4j-log4j12-1.7.10.jar:sunec.jar:sunjce_provider.jar:super-csv-2.2.0.jar:xercesImpl-2.9.1.jar \
  -Dhdp.version=2.5.0.0-1245 \
  -Djava.net.preferIPv4Stack=true \
  -Dhdp.version=2.5.0.0-1245 \
  -Dhadoop.log.dir=/home/userid \
  -Dhadoop.log.file=hadoop.log \
  -Dhadoop.home.dir=/home/userid \
  -Dhadoop.id.str=userid \
  -Dhadoop.root.logger=INFO,console \
  -Djava.library.path=:/home/userid \
  -Dhadoop.policy.file=hadoop-policy.xml \
  -Djava.net.preferIPv4Stack=true \
  -Djava.util.logging.config.file=/home/userid/parquet-logging.properties \
  -Dlog4j.configuration=beeline-log4j.properties \
  -Dhadoop.security.logger=INFO,NullAppender \
  org.apache.hadoop.util.RunJar /home/userid/hive-beeline-1.2.1000.2.5.0.0-1245.jar org.apache.hive.beeline.BeeLine \
  -n userid -p pass \
  -u "jdbc:hive2://knox.company.com:8000/;ssl=true;transportMode=http;httpPath=gateway/tdcprd/hive"
Some of the parameters are probably not necessary, but I kept them because that is how it is done on the source system. You should use the source system's java invocation as a reference pattern.
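Since the classpath is just the list of jars in the folder, it could be generated instead of typed by hand. The sketch below joins every jar in a folder with colons; `build_classpath` is a hypothetical helper, and I have not verified that a generated classpath behaves identically to the hand-written one on every distro.

```shell
#!/bin/sh
# Join every *.jar in a folder into a colon-separated classpath string.
# build_classpath is a hypothetical helper.
build_classpath() {
  cp_str=
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue            # no jars: skip the unexpanded pattern
    cp_str="${cp_str:+$cp_str:}$jar"     # add ':' only between entries
  done
  printf '%s\n' "$cp_str"
}
```

From inside the jar folder, the result can be passed as `-classpath "$(build_classpath .)"` in place of the long literal list, keeping the rest of the java command unchanged.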