I have saved my crawled data by nutch in Hbase whose file system is hdfs. Then I copied my data (One table of hbase) from hdfs directly to some local directory by command
hadoop fs -CopyToLocal /hbase/input ~/Documents/output
After that, I copied that data back to another hbase (other system) by following command
hadoop fs -CopyFromLocal ~/Documents/input /hbase/mydata
It is saved in hdfs and when I use list
command in hbase shell, it shows it as another table i.e 'mydata' but when I run scan
command, it says there is no table with 'mydata' name.
What is problem with above procedure? In simple words:
If you can use the Hbase command instead to backup hbase tables you can use the Hbase ExportSnapshot Tool which copies the hfiles,logs and snapshot metadata to other filesystem(local/hdfs/s3) using a map reduce job.
Take snapshot of the table
$ ./bin/hbase shell
hbase> snapshot 'myTable', 'myTableSnapshot-122112'
Export to the required file system
$ ./bin/hbase class org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to fs://path_to_your_directory
You can export it back from the local file system to hdfs:///srv2:8082/hbase and run the restore command from hbase shell to recover the table from the snapshot.
$ ./bin/hbase shell
hbase> disable 'myTable'
hbase> restore_snapshot 'myTableSnapshot-122112'
Reference:Hbase Snapshots
If you want to export the table from one hbase cluster and import it to another, use any one of the following method:
Using Hadoop
Export
$ bin/hadoop jar <path/to/hbase-{version}.jar> export \
<tablename> <outputdir> [<versions> [<starttime> [<endtime>]]
NOTE: Copy the output directory in hdfs from the source to destination cluster
Import
$ bin/hadoop jar <path/to/hbase-{version}.jar> import <tablename> <inputdir>
Note: Both outputdir and inputdir are in hdfs.
Using Hbase
Export
$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
<tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
Copy the output directory in hdfs from the source to destination cluster
Import
$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
Reference: Hbase tool to export and import
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With