Hadoop backup and recovery tool and guidance

Question

I am new to Hadoop and need to learn the details of backup and recovery. I have studied Oracle backup and recovery; will that help with Hadoop? Where should I start?

brandon.bell · Accepted Answer

There are a few options for backup and recovery. As s.singh points out, data replication is not DR.

HDFS supports snapshots. They can be used to protect against user errors, recover deleted or overwritten files, etc. That said, snapshots live on the same cluster, so they are not DR in the event of a total failure of the Hadoop cluster. (http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html)
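As a rough illustration of the snapshot workflow, here is a minimal Python sketch that only builds the command lines (it does not run them). The `hdfs dfsadmin -allowSnapshot` and `hdfs dfs -createSnapshot` commands and the read-only `.snapshot` directory come from the HDFS snapshot documentation linked above; the directory `/data/warehouse`, the snapshot name, and the helper function names are assumptions made up for this example.

```python
# Sketch: build HDFS snapshot commands as argument lists.
# Paths, snapshot names, and helper names are hypothetical.

def allow_snapshot_cmd(directory):
    """Admin step: mark a directory as snapshottable."""
    return ["hdfs", "dfsadmin", "-allowSnapshot", directory]

def create_snapshot_cmd(directory, name):
    """Create a named snapshot of a snapshottable directory."""
    return ["hdfs", "dfs", "-createSnapshot", directory, name]

def restore_file_cmd(directory, name, relative_path):
    """Copy one file back out of the read-only .snapshot directory.

    Plain -cp fails if the destination already exists, which suits the
    'file was deleted by mistake' case.
    """
    base = directory.rstrip("/")
    src = f"{base}/.snapshot/{name}/{relative_path}"
    dst = f"{base}/{relative_path}"
    return ["hdfs", "dfs", "-cp", src, dst]

if __name__ == "__main__":
    commands = [
        allow_snapshot_cmd("/data/warehouse"),
        create_snapshot_cmd("/data/warehouse", "before-cleanup"),
        restore_file_cmd("/data/warehouse", "before-cleanup", "sales/part-00000"),
    ]
    for cmd in commands:
        print(" ".join(cmd))
```

To actually execute a command you could pass the list to `subprocess.run(cmd, check=True)` on a machine with the Hadoop client installed.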

Your best bet is to keep off-site backups. These can go to another Hadoop cluster, S3, etc., and can be copied with distcp. (http://hadoop.apache.org/docs/stable1/distcp2.html), (https://wiki.apache.org/hadoop/AmazonS3)
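An off-site copy with distcp might look roughly like the sketch below. It only assembles the `hadoop distcp` command line; the `-update` and `-m` options are standard distcp options, while the namenode host names, the bucket name, and the `distcp_cmd` helper are hypothetical. Copying to S3 through an `s3a://` URI also assumes the cluster has the S3 connector and credentials configured.

```python
# Sketch: build a distcp command for an off-site backup.
# Host names, bucket name, and the helper itself are hypothetical.

def distcp_cmd(source, target, update=True, max_maps=None):
    """Return the argument list for a distcp copy from source to target."""
    cmd = ["hadoop", "distcp"]
    if update:
        # Copy only files that are missing or different at the target.
        cmd.append("-update")
    if max_maps is not None:
        # Cap the number of simultaneous map tasks doing the copy.
        cmd += ["-m", str(max_maps)]
    cmd += [source, target]
    return cmd

if __name__ == "__main__":
    # Cluster-to-cluster backup.
    print(" ".join(distcp_cmd(
        "hdfs://nn1.example.com:8020/data/warehouse",
        "hdfs://nn2.example.com:8020/backup/warehouse",
        max_maps=20,
    )))
    # Backup to an S3 bucket.
    print(" ".join(distcp_cmd(
        "hdfs://nn1.example.com:8020/data/warehouse",
        "s3a://example-backup-bucket/warehouse",
    )))
```

Running distcp from a snapshot path (for example `/data/warehouse/.snapshot/<name>`) rather than the live directory gives the copy a consistent view of the data while jobs keep writing.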

Here is a Slideshare by Cloudera discussing DR (http://www.slideshare.net/cloudera/hadoop-backup-and-disaster-recovery)

Kumar · Answer

Hadoop is designed to run on large clusters with thousands of nodes, and because every block is replicated, losing data to a single node failure is unlikely. You can increase the replication factor so that the data is replicated to more nodes across the cluster.

See Data Replication.
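For example, the replication factor of existing files can be changed with `hdfs dfs -setrep`. The sketch below only builds that command; the path and the `setrep_cmd` helper are assumptions for illustration. The default for newly written files is controlled separately by the `dfs.replication` property in `hdfs-site.xml`.

```python
# Sketch: build an 'hdfs dfs -setrep' command.
# The path and helper name are hypothetical.

def setrep_cmd(path, factor, wait=False):
    """Return the argument list that sets the replication factor of a path."""
    if factor < 1:
        raise ValueError("replication factor must be at least 1")
    cmd = ["hdfs", "dfs", "-setrep"]
    if wait:
        # -w blocks until the target replication is reached (can be slow).
        cmd.append("-w")
    cmd += [str(factor), path]
    return cmd

if __name__ == "__main__":
    print(" ".join(setrep_cmd("/data/warehouse", 5, wait=True)))
```

A higher replication factor costs proportionally more disk, and it still protects only against node failures inside one cluster.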

For backing up the namenode logs, you can use either the secondary namenode or Hadoop High Availability.

Secondary Namenode

The secondary namenode periodically takes a checkpoint of the namenode's metadata by merging the edit log into the filesystem image. If the namenode fails, you can recover that metadata (which holds the data block information) from the secondary namenode, as of its last checkpoint.
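Independently of the secondary namenode, the namenode's latest filesystem image can also be pulled to local disk with `hdfs dfsadmin -fetchImage`, which is handy for keeping dated copies of the metadata. This is only a sketch that builds the command: the backup base directory, the dated folder layout, and the helper names are assumptions, not something from the answer above.

```python
# Sketch: build a command that downloads the latest fsimage from the namenode.
# The base directory, dated layout, and helper names are hypothetical.
import datetime

def dated_backup_dir(base, day):
    """Return a per-day directory such as /backup/namenode/2024-01-31."""
    return f"{base.rstrip('/')}/{day.isoformat()}"

def fetch_image_cmd(local_dir):
    """Return the argument list that saves the newest fsimage into local_dir."""
    return ["hdfs", "dfsadmin", "-fetchImage", local_dir]

if __name__ == "__main__":
    target = dated_backup_dir("/backup/namenode", datetime.date.today())
    # The local directory must exist before the command is run.
    print(" ".join(fetch_image_cmd(target)))
```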

High Availability

High Availability is a feature that runs more than one namenode in the cluster. One namenode is active and the other is in standby. The edit log is kept available to both namenodes, so if the active namenode fails, the standby becomes active and takes over operations.

In most cases, however, we also need to plan for backup and disaster recovery. Refer to @brandon.bell's answer.

Tags:

hadoop

Anand Kamathi
