I'm totally new to hadoop and just finished installing which took me 2 days... I'm now trying with the hadoop dfs command, but i just couldn't understand it, although i've been browsing for days, i couldnt find the answer to what i want to know. All the examples shows what the result is supposed to be, without explaining the real structure of it, so i will be happy if someone could assist me in understanding hadoop hdfs.
I've created a directory on the HDFS.
bin/hadoop fs -mkdir input
OK, i shall check on it with the ls command.
bin/hadoop fs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-07-30 11:08 input
OK, no problem, everything seems perfect.. BUT where is actually the HDFS data stored? I thought it would store in the my datanode directory (/home/hadoop/datastore), which was defined in core-site.xml under hadoop.tmp.dir, but it is not there..
Then i tried to view through the WEB-UI and i found that "input" was created under "/user/hadoop/" (/user/hadoop/input).
My questions are
I'm sorry if my questions are stupid.. a newbie struggling to understand hadoop better..
Thank you in advance.
Hdfs is not a posix file system and you have to use hadoop api to read and view this file system. That's the reason you have to do hadoop fs -ls as you are using hadoop API to read files here. Data in hdfs are stored in blocks and is stored in all datanodes. Metadata about this file system is stored on Namenode. The data files you see in the directory "/home/hadoop/datastore " are blocks stored on individual datanode.
I think you should explore more about its file system in its tutorial. Yahoo, YDN tutorial on hdfs
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With