
Hadoop mapreduce streaming from HBase

I'm building a Hadoop (0.20.1) MapReduce job that uses HBase (0.20.1) as both the data source and the data sink. I would like to write the job in Python, which requires using hadoop-0.20.1-streaming.jar to stream data to and from my Python scripts. This works fine when the data source and sink are HDFS files.
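
For reference, the kind of Python streaming script described above can be sketched as follows. This is only an illustrative word-count-style example, not the actual job: the function names `map_lines` and `reduce_lines`, the `map`/`reduce` command-line argument, and the tab-separated record layout are assumptions made for the sketch. Hadoop Streaming passes records to the script on stdin and reads tab-separated key/value lines back from stdout.

```python
#!/usr/bin/env python
"""Hadoop Streaming sketch: run as `job.py map` or `job.py reduce`."""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(lines):
    """Sum the counts per key; assumes input is already sorted by key,
    which is how Hadoop Streaming delivers records to a reducer."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (key, sum(int(value) for _, value in group))


# Only run when a stage ("map" or "reduce") is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        sys.stdout.write(record + "\n")
```

With HDFS files, a script like this would be passed to the streaming jar through its `-mapper`, `-reducer`, `-file`, `-input` and `-output` options (the script name `job.py` is hypothetical). The open question is what replaces the HDFS input and output when the data lives in HBase tables.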

Does Hadoop support streaming to and from HBase for MapReduce jobs?

Richard Dorman asked Nov 10 '09 09:11


1 Answer

The project linked below seems to do what I want, but it's not part of the Hadoop distribution. Any other suggestions or comments are still welcome.

http://github.com/wanpark/hadoop-hbase-streaming

Richard Dorman answered Oct 19 '22 23:10