 

How to import/export hbase data via hdfs (hadoop commands)

I have saved my crawled data from Nutch in HBase, whose file system is HDFS. Then I copied my data (one HBase table) from HDFS directly to a local directory with the command

hadoop fs -copyToLocal /hbase/input ~/Documents/output

After that, I copied that data back to another HBase instance (on another system) with the following command

hadoop fs -copyFromLocal ~/Documents/input /hbase/mydata

The data is saved in HDFS, and when I use the list command in the HBase shell it shows up as another table, i.e. 'mydata', but when I run the scan command it says there is no table named 'mydata'.

What is the problem with the above procedure? In simple words:

  1. I want to copy an HBase table to my local file system using a Hadoop command
  2. Then, I want to save it directly into HDFS on another system using a Hadoop command
  3. Finally, I want the table to appear in HBase and display its data like the original table
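For reference, the `hadoop fs` flags above are camelCase (`-copyToLocal`, `-copyFromLocal`). A corrected sketch of the attempted procedure follows, using the paths from the question; note that even with the right flags, hand-copying HBase's on-disk files does not register the table with HBase, which is why it cannot be scanned:

```shell
#!/bin/sh
# Corrected spelling of the hadoop fs sub-commands from the question.
# Copying raw HBase files this way bypasses HBase itself, so the table
# is never registered in HBase's metadata and cannot be scanned.
SRC_HDFS=/hbase/input            # table directory in the source HDFS
LOCAL_DIR="$HOME/Documents/output"
DEST_HDFS=/hbase/mydata          # destination path on the other cluster

# Guard so the sketch is safe to run without a Hadoop installation:
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -copyToLocal "$SRC_HDFS" "$LOCAL_DIR"
    hadoop fs -copyFromLocal "$LOCAL_DIR" "$DEST_HDFS"
else
    echo "hadoop not found; commands shown for reference only"
fi
```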
asked Sep 18 '14 by Hafiz Muhammad Shafiq

2 Answers

If you can use HBase commands instead to back up HBase tables, you can use the HBase ExportSnapshot tool, which copies the HFiles, logs, and snapshot metadata to another filesystem (local/HDFS/S3) using a MapReduce job.

  • Take snapshot of the table

    $ ./bin/hbase shell
    hbase> snapshot 'myTable', 'myTableSnapshot-122112'

  • Export to the required file system

    $ ./bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'myTableSnapshot-122112' -copy-to fs://path_to_your_directory

You can export the snapshot back from the local file system to the destination cluster (e.g. hdfs://srv2:8082/hbase) and run the restore command from the HBase shell to recover the table from the snapshot.

 $ ./bin/hbase shell
 hbase> disable 'myTable'
 hbase> restore_snapshot 'myTableSnapshot-122112'
 hbase> enable 'myTable'
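Putting the steps above together, a minimal end-to-end sketch follows. The destination filesystem address and mapper count are placeholders, not values from the answer:

```shell
#!/bin/sh
# End-to-end snapshot-based copy between clusters (sketch; the destination
# address and mapper count below are placeholders).
TABLE=myTable
SNAPSHOT=myTableSnapshot-122112
DEST_FS=hdfs://dest-cluster:8020/hbase   # hypothetical destination HBase root

if command -v hbase >/dev/null 2>&1; then
    # 1. Take a snapshot of the table on the source cluster.
    echo "snapshot '$TABLE', '$SNAPSHOT'" | hbase shell

    # 2. Ship HFiles and snapshot metadata to the destination (MapReduce job).
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot "$SNAPSHOT" -copy-to "$DEST_FS" -mappers 4

    # 3. On the destination cluster, restore the snapshot as a live table.
    printf "disable '%s'\nrestore_snapshot '%s'\nenable '%s'\n" \
        "$TABLE" "$SNAPSHOT" "$TABLE" | hbase shell
else
    echo "hbase not found; commands shown for reference only"
fi
```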

Reference: HBase Snapshots

answered Nov 23 '22 by darkknight444

If you want to export a table from one HBase cluster and import it into another, use any one of the following methods:

Using Hadoop

  • Export

    $ bin/hadoop jar <path/to/hbase-{version}.jar> export \
         <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    

    NOTE: Copy the output directory in HDFS from the source to the destination cluster

  • Import

    $ bin/hadoop jar <path/to/hbase-{version}.jar> import <tablename> <inputdir>
    

Note: Both outputdir and inputdir are HDFS paths.
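The "copy the output directory between clusters" step is typically done with distcp. A sketch, with placeholder cluster addresses and paths (none of these are from the answer):

```shell
#!/bin/sh
# Copy the Export output from the source to the destination cluster with
# distcp (cluster addresses and paths below are placeholders).
SRC=hdfs://src-cluster:8020/user/backup/mytable
DEST=hdfs://dest-cluster:8020/user/backup/mytable

if command -v hadoop >/dev/null 2>&1; then
    # distcp runs a MapReduce job to copy files in parallel between clusters.
    hadoop distcp "$SRC" "$DEST"
else
    echo "hadoop not found; command shown for reference only"
fi
```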

Using Hbase

  • Export

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
       <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    
  • Copy the output directory in hdfs from the source to destination cluster

  • Import

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
    
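One caveat worth noting: Import writes into an existing table, so create it with matching column families on the destination cluster first. A sketch, where the table name, column family, and input path are placeholders:

```shell
#!/bin/sh
# Import does not create the table; it must already exist on the destination
# cluster with the same column families ('mytable'/'cf' are placeholders).
TABLE=mytable
FAMILY=cf
INPUT_DIR=/user/backup/mytable   # hypothetical HDFS path holding the export

if command -v hbase >/dev/null 2>&1; then
    # Create the destination table, then load the exported data into it.
    echo "create '$TABLE', '$FAMILY'" | hbase shell
    hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$INPUT_DIR"
else
    echo "hbase not found; commands shown for reference only"
fi
```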

    Reference: HBase tool to export and import

answered Nov 23 '22 by Nanda