Various websites (such as Hortonworks) recommend against configuring RAID for HDFS setups, mainly on performance and redundancy grounds.
RAID is, however, recommended on the NameNode.
But what about implementing RAID on each DataNode's storage disks?
HDFS clusters do not benefit from using RAID (Redundant Array of Independent Disks) for datanode storage (although RAID is recommended for the namenode's disks to protect against corruption of its metadata). The redundancy that RAID provides is not needed, since HDFS handles it by replicating data between nodes.
Since the namenode is a single-point-of-failure in HDFS, it requires a more reliable hardware setup. Therefore, the use of RAID is recommended on namenodes.
RAID (redundant array of independent disks) is a way of storing the same data in different places (thus, redundantly) on multiple hard disks. The Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop applications.
RAID 0 is the most affordable RAID configuration and is relatively easy to set up. However, it does not include any redundancy, fault tolerance, or parity. Hence, a problem on any one of the disks in the array can result in complete data loss.
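To make the RAID 0 failure mode concrete, here is a minimal Python sketch of striping. It is an illustration only, not how a real RAID controller is implemented: the function names `stripe` and `reassemble` and the 4-byte stripe size are assumptions made for this example.

```python
def stripe(data: bytes, num_disks: int, stripe_size: int = 4) -> list[bytearray]:
    """Distribute data round-robin across disks in fixed-size stripes (RAID 0 style)."""
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), stripe_size):
        # Stripe k goes to disk k mod num_disks; nothing is mirrored and no parity is written.
        disks[(i // stripe_size) % num_disks] += data[i:i + stripe_size]
    return disks


def reassemble(disks: list, total_len: int, stripe_size: int = 4) -> bytes:
    """Rebuild the original data; a failed disk is modelled as None."""
    if any(d is None for d in disks):
        # Every file has stripes on every disk, so one failure loses the whole array.
        raise IOError("RAID 0 array lost: a member disk failed and there is no parity or mirror")
    out = bytearray()
    offsets = [0] * len(disks)
    disk = 0
    while len(out) < total_len:
        out += disks[disk][offsets[disk]:offsets[disk] + stripe_size]
        offsets[disk] += stripe_size
        disk = (disk + 1) % len(disks)
    return bytes(out)
```

With all disks present, `reassemble(stripe(data, 3), len(data))` returns the original bytes. Replacing any single entry of the disk list with `None` makes `reassemble` raise, which mirrors the complete data loss described above.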
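RAID is used for two purposes. Depending on the RAID configuration, you can get better performance (data is striped across disks, so reads and writes run in parallel), fault tolerance (mirrored copies or parity, so a disk failure does not lose data), or both.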
HDFS has similar mechanisms built in software. HDFS splits files into chunks (so-called file blocks) which are replicated across multiple datanodes and stored on their local filesystems. Usually, datanodes have multiple disks which are individually mounted (JBOD). A datanode should distribute its file blocks across all its disks / local filesystems.
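This ensures fault tolerance, because every block has replicas on other datanodes, and fast reads, because blocks are spread over many disks that can be read in parallel.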
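The block splitting and replica placement described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual HDFS placement policy (which is also rack-aware): the 128 MB block size and replication factor of 3 are taken as typical defaults, and the helper names, node names, and mount points are hypothetical.

```python
def split_into_blocks(file_size: int, block_size: int = 128 * 1024 * 1024) -> list[int]:
    """Return the sizes of the blocks a file of file_size bytes is cut into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_blocks(num_blocks: int, datanodes: dict[str, list[str]], replication: int = 3):
    """Assign each block to `replication` distinct datanodes, round-robin over each node's disks.

    datanodes maps a node name to its individually mounted disks (JBOD).
    """
    nodes = list(datanodes)
    if replication > len(nodes):
        raise ValueError("need at least as many datanodes as replicas")
    next_disk = {node: 0 for node in nodes}
    placement = {}
    for block in range(num_blocks):
        replicas = []
        for r in range(replication):
            node = nodes[(block + r) % len(nodes)]  # distinct node for each replica
            disks = datanodes[node]
            replicas.append((node, disks[next_disk[node] % len(disks)]))
            next_disk[node] += 1  # the next block on this node goes to its next disk
        placement[block] = replicas
    return placement


def surviving_replicas(placement: dict, failed_node: str) -> dict:
    """Replicas that remain readable after one whole datanode fails."""
    return {b: [rep for rep in reps if rep[0] != failed_node] for b, reps in placement.items()}
```

For example, a 300 MB file becomes three blocks (128 MB, 128 MB, 44 MB). Placed on four datanodes with two disks each, every block still has at least two replicas after any single node is lost, without any RAID underneath.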
Since HDFS takes care of fault tolerance and "striped" reading itself, there is no need to use RAID underneath HDFS. Using RAID will only be more expensive, offer less storage, and may also be slower (depending on the concrete RAID config).
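The "less storage" point can be checked with simple arithmetic. The sketch below compares the usable capacity of one datanode under JBOD and a few common RAID levels, and then divides by the HDFS replication factor. The chosen RAID levels, the 12 x 4 TB node, and the function names are assumptions for illustration; real arrays also lose a little capacity to metadata and hot spares.

```python
def usable_tb(num_disks: int, disk_tb: float, layout: str) -> float:
    """Usable capacity of one node's disks under a given layout (idealised)."""
    if layout in ("jbod", "raid0"):
        return num_disks * disk_tb          # every disk contributes fully
    if layout == "raid5":
        return (num_disks - 1) * disk_tb    # one disk's worth of parity
    if layout == "raid6":
        return (num_disks - 2) * disk_tb    # two disks' worth of parity
    if layout == "raid10":
        return (num_disks // 2) * disk_tb   # everything is mirrored once
    raise ValueError(f"unknown layout: {layout}")


def hdfs_effective_tb(num_disks: int, disk_tb: float, layout: str, replication: int = 3) -> float:
    """Capacity left for unique data once HDFS replication is applied on top."""
    return usable_tb(num_disks, disk_tb, layout) / replication


if __name__ == "__main__":
    for layout in ("jbod", "raid5", "raid6", "raid10"):
        print(layout, usable_tb(12, 4.0, layout), round(hdfs_effective_tb(12, 4.0, layout), 2))
```

Because HDFS already stores each block several times, any capacity spent on RAID parity or mirroring on a datanode is paid on top of replication rather than instead of it.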