I am very new to Hadoop and HBase and have some conceptual questions that are tripping me up during every tutorial I've found.
I have Hadoop and HBase running on a single node within an Ubuntu VM on my Windows 7 system. I have a CSV file that I would like to load into a single HBase table.
The columns are: loan_number, borrower_name, current_distribution_date, loan_amount
I know that I need to write a MapReduce job to load this CSV file into HBase. The following tutorial describes the Java needed to write this MapReduce job: http://salsahpc.indiana.edu/ScienceCloud/hbase_hands_on_1.htm
What I'm missing is:
Where do I save these files, and where do I compile them? Should I compile them on my Windows 7 machine running Visual Studio 12 and then move them to the Ubuntu VM?
I read this SO question and its answers, but I guess I'm still missing the basics: Loading CSV File into Hbase table using MapReduce
I can't find anything covering these basic Hadoop/HBase logistics. Any help would be greatly appreciated.
There are no data types in HBase; data is stored as byte arrays in the cells of an HBase table. The value in a cell is versioned by the timestamp at which it was stored, so each cell of an HBase table may contain multiple versions of the data.
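To make the byte-array-plus-versions model concrete, here is a toy sketch in Python. This is only an illustration of the concept, not real HBase code; the row key, column name, and timestamps are made up.

```python
# Toy model of HBase's storage layout: every value is a raw byte array,
# and each cell keeps multiple timestamped versions (newest first).
table = {}  # row key -> {column -> [(timestamp, value_bytes), ...]}

def put(row, column, value, ts):
    """Store a value; HBase stores raw bytes, so we encode here."""
    versions = table.setdefault(row, {}).setdefault(column, [])
    versions.insert(0, (ts, value.encode("utf-8")))

def get(row, column):
    """Return the newest version, which is what HBase returns by default."""
    return table[row][column][0][1]

put("loan-001", "cf:borrower_name", "Alice", ts=1)
put("loan-001", "cf:borrower_name", "Alicia", ts=2)
print(get("loan-001", "cf:borrower_name"))  # b'Alicia' -- older version kept too
```

A `get` without a timestamp returns the latest version, but the older ones remain retrievable until compaction discards them.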
There is no need to code a MapReduce job yourself to bulk load data into HBase. There are several ways to do it:
1) Use the HBase tools importtsv and completebulkload: http://hbase.apache.org/book/arch.bulk.load.html
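For your file, option 1 could look roughly like this. The table name `loans`, column family `cf`, and HDFS paths are assumptions; the table must already exist (e.g. `create 'loans', 'cf'` in the HBase shell), and importtsv defaults to tab separators, so the separator is overridden for CSV:

```shell
# Direct puts into the running table, using loan_number as the row key:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  '-Dimporttsv.separator=,' \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:borrower_name,cf:current_distribution_date,cf:loan_amount \
  loans /user/hadoop/loans.csv

# Or generate HFiles first, then bulk load them into the table:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  '-Dimporttsv.separator=,' \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:borrower_name,cf:current_distribution_date,cf:loan_amount \
  -Dimporttsv.bulk.output=/tmp/loans_hfiles \
  loans /user/hadoop/loans.csv
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/loans_hfiles loans
```

The second form (HFiles plus completebulkload) is much faster for large files because it bypasses the normal write path.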
2) Use Pig to bulk load data. Example:
A = LOAD '/hbasetest.txt' USING PigStorage(',') as
(strdata:chararray, intdata:long);
STORE A INTO 'hbase://mydata'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
'mycf:intdata');
3) Do it programmatically using the HBase API. I have a small project called hbaseloader that loads files into an HBase table (the table has just one column family, which holds the content of the file). Take a look at it; you just need to define the structure of your table and modify the code to read a CSV file and parse it.
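The parsing step you would add is the same in any language. Here is a hedged sketch in Python (hbaseloader itself is Java), assuming loan_number is the row key and a single column family named `cf` — both are choices you make when you define the table, not requirements:

```python
import csv
import io

# Columns from the question's CSV, in file order.
COLUMNS = ["loan_number", "borrower_name",
           "current_distribution_date", "loan_amount"]

def csv_to_puts(csv_text):
    """Yield (row_key, {qualified_column: value}) pairs, one per CSV record,
    everything encoded to bytes the way HBase ultimately stores it."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    for rec in reader:
        row_key = rec["loan_number"].encode("utf-8")
        cells = {("cf:" + c).encode("utf-8"): rec[c].encode("utf-8")
                 for c in COLUMNS[1:]}
        yield row_key, cells

sample = "1001,John Doe,2013-01-15,250000\n"
for row_key, cells in csv_to_puts(sample):
    print(row_key, cells[b"cf:loan_amount"])  # b'1001' b'250000'
```

In the Java version, each yielded pair would become one `Put` keyed by the row key, with one `add` per column.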
4) Do it programmatically using a MapReduce job, as in the tutorial you mentioned.
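If you do go the MapReduce route, note that a load like this is map-only: each map call handles one CSV line independently, and there is no reducer. A minimal sketch of the per-line mapper logic, in Python for brevity (a real job would build an HBase `Put` per line and write through `TableOutputFormat`, or run this under Hadoop Streaming with a stdin loop; the `cf` family name is an assumption):

```python
def map_line(line):
    """Turn one CSV record into row_key TAB column=value strings,
    mirroring what the tutorial's Java mapper does with Put objects."""
    loan_number, borrower, dist_date, amount = line.rstrip("\n").split(",")
    for col, val in [("cf:borrower_name", borrower),
                     ("cf:current_distribution_date", dist_date),
                     ("cf:loan_amount", amount)]:
        yield "%s\t%s=%s" % (loan_number, col, val)

for out in map_line("1001,John Doe,2013-01-15,250000"):
    print(out)
```

Because the work is embarrassingly parallel, this scales to however many mappers your input splits produce.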