I am newbie on this.Would like to know about the basic differences between hadoop distributed file system and network file system and what are the benefits of hdfs over nfs?
First lets start out with some definitions.
NFS (Network File system): A protocol developed that allows clients to access files over the network. NFS clients allow files to be accessed as if the files reside on the local machine, even though they reside on the disk of a networked machine.
HDFS (Hadoop Distributed File System): A file system that is distributed amongst many networked computers or nodes. HDFS is fault tolerant because it stores multiple replicas of files on the file system, the default replication level is 3.
So what is the big difference? Replication/Fault Tolerance. HDFS was designed to survive failures. NFS does not have any fault tolerance built in.
What are some benefits of HDFS over NFS? Other than fault tolerance, HDFS does support multiple replicas of files. This eliminates (or eases) the common bottleneck of many clients accessing a single file. Since files have multiple replicas, on different physical disks, reading performance scales better than NFS.
Note: Hadoop offers NFSGateway to bridge this difference
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With