Perl Thrift client to Hive?

Tags: perl, thrift, hive

I'd like to connect to a Hadoop-based Hive datastore using Perl. Hive allows connections through its Thrift interface (http://wiki.apache.org/hadoop/Hive/HiveClient), and there is a Thrift implementation for Perl (e.g. http://metacpan.org/pod/Thrift::XS). However, the only Perl Thrift client I found is a Cassandra client.

Any ideas if such a client exists, or how to create it? Maybe it's even possible to connect without explicitly defining one?

(PS - there is also an ODBC/JDBC interface to Hive, but installing these modules is a headache, and would be a last resort)

thanks!

etov asked Mar 13 '11 11:03


1 Answer

After some reading (most notably: blog.fingertap.org/?1a253760), I succeeded in creating a Perl Thrift client, and using it to query my server.

Steps:

  1. Download, build and install Thrift: http://incubator.apache.org/thrift/download/. Don't forget to run make install for the Perl library in lib/perl as well.

  2. Download the infrastructure's .thrift files from Hive's SVN, under the dist of your Hive installation (http://svn.apache.org/viewvc/hive/). The files I used: fb303.thrift, queryplan.thrift, hive_metastore.thrift and thrift_hive.thrift. I located them manually, but there might be better ways of doing that.

  3. Generate the Perl code using thrift: thrift -r --gen perl hive_service.thrift
    Note: I had to build the directory tree for the required includes myself, and point the -I option at this tree's root. I worked out the required structure from the errors thrift threw at me, but again, there might be more elegant ways of doing that.

Now, the following Perl code, written along the lines of the Python example in Hive's client wiki, works for me:

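use strict;
use warnings;

use Thrift;
use Thrift::Socket;
use Thrift::BufferedTransport;   # matches the transport constructed below
use Thrift::BinaryProtocol;
use lib <LOCATION OF GENERATED PERL CODE>;
use ThriftHive;

# init variables: $host and $port of the Hive server, the $query to run,
# and $count, the number of rows to fetch
my ($host, $port, $query, $count);   # assign your own values here

my $socket    = Thrift::Socket->new($host, $port);
my $transport = Thrift::BufferedTransport->new($socket);
my $protocol  = Thrift::BinaryProtocol->new($transport);
my $client    = ThriftHiveClient->new($protocol);

# Thrift reports failures by dying, so check $@ after each eval
eval { $transport->open() };
die "Could not open transport\n" if $@;
eval { $client->execute($query) };
die "Query failed\n" if $@;

for (my $i = 0; $i < $count; $i++)
{
   my $row;
   eval { $row = $client->fetchOne() };
   last if $@;   # stop on a fetch error

   # use $row
}

$transport->close();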
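
To make the "use $row" step a bit more concrete, here is a minimal sketch of splitting a fetched row into columns. It assumes that fetchOne() hands back each row as a single tab-delimited string; that is an assumption about the server's output format, so check it against your own results. The helper name parse_row and the sample row are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the generated client): split one row
# string into its column values. ASSUMPTION: columns are tab-delimited.
sub parse_row {
    my ($row) = @_;
    return () unless defined $row && length $row;
    # A negative limit keeps trailing empty fields instead of dropping them.
    return split /\t/, $row, -1;
}

# Made-up sample row standing in for a value returned by fetchOne().
my @cols = parse_row("42\talice\t");
print join('|', @cols), "\n";   # prints "42|alice|"
```

Inside the fetch loop this would be called as my @cols = parse_row($row); the negative split limit matters when the last column of a row is empty.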
etov answered Sep 28 '22 06:09