
Where are my files (directories) stored when I use hadoop fs -mkdir?

Tags: hadoop, hdfs

I'm totally new to Hadoop and have just finished installing it, which took me two days. I'm now trying out the hadoop dfs command, but I can't make sense of it: I've been searching for days and couldn't find the answer to what I want to know. All the examples show what the result is supposed to be without explaining the underlying structure, so I would be grateful if someone could help me understand HDFS.

I've created a directory on the HDFS.

bin/hadoop fs -mkdir input

OK, I'll check it with the ls command.

bin/hadoop fs -ls
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2012-07-30 11:08 input

OK, no problem, everything seems fine. BUT where is the HDFS data actually stored? I thought it would be stored in my datanode directory (/home/hadoop/datastore), which I defined in core-site.xml under hadoop.tmp.dir, but it is not there.

Then I looked through the web UI and found that "input" had been created under "/user/hadoop/" (/user/hadoop/input).

My questions are:

  • (1) What is the datanode directory (hadoop.tmp.dir) used for, since it doesn't seem to store what I create through the dfs command?
  • (2) Everything created with the dfs command goes to /user/XXX/. How can I change that location?
  • (3) I can't see anything when I try to access it with a normal Linux command (ls /user/hadoop). Does /user/hadoop exist only logically?

I'm sorry if my questions are stupid. I'm a newbie struggling to understand Hadoop better.

Thank you in advance.

user1561806 asked Jul 30 '12 03:07



1 Answer

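HDFS is not a POSIX file system, so you have to use the Hadoop API to read and browse it. That is why you run hadoop fs -ls instead of a plain ls: the command goes through the Hadoop API. /user/hadoop/input is a path inside the HDFS namespace, not a directory on the local Linux disk, and a relative path such as input is resolved against your HDFS home directory, /user/<username>. File data in HDFS is split into blocks, which are distributed across the DataNodes, while the metadata about the file system (the directory tree and the file-to-block mapping) is kept on the NameNode. A directory such as input therefore exists only as NameNode metadata and has no blocks of its own. The data files you see under "/home/hadoop/datastore" are the blocks stored on that individual DataNode.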
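
To make the path handling concrete, here is a minimal shell sketch of how a path given to hadoop fs ends up in the HDFS namespace. It is an illustration only, not Hadoop's actual code: resolve_hdfs_path is a hypothetical helper, and it assumes the default home directory prefix /user. It shows why "input" in the question appeared as /user/hadoop/input, and that an absolute path is used as given.

```shell
# Illustration only: mimics how the HDFS client interprets a path argument.
# resolve_hdfs_path is a hypothetical helper, not part of Hadoop.
# Assumption: the HDFS home directory is /user/<username> (the default).
resolve_hdfs_path() {
  user="$1"
  path="$2"
  case "$path" in
    /*) printf '%s\n' "$path" ;;               # absolute path: used as given
    *)  printf '%s\n' "/user/$user/$path" ;;   # relative path: placed under the home directory
  esac
}

resolve_hdfs_path hadoop input         # prints /user/hadoop/input
resolve_hdfs_path hadoop /data/input   # prints /data/input
```

So to create a directory somewhere other than /user/hadoop, pass an absolute path, for example bin/hadoop fs -mkdir /data/input (the path /data/input is just an example).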

I think you should read more about the file system in a tutorial, for example the Yahoo! Developer Network (YDN) tutorial on HDFS.
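
If you want to look at the block files on a DataNode's local disk, a small sketch like the following may help. list_block_files is a hypothetical helper, and the directory passed to it is an assumption: with hadoop.tmp.dir set to /home/hadoop/datastore and otherwise default Hadoop 1.x settings, the DataNode is expected to keep its blocks under /home/hadoop/datastore/dfs/data. The listing stays empty until a file has actually been uploaded (for example with hadoop fs -put), because creating a directory does not produce any blocks.

```shell
# List HDFS block files in a DataNode's local storage directory.
# list_block_files is a hypothetical helper; the directory to inspect is passed in.
list_block_files() {
  data_dir="$1"
  if [ ! -d "$data_dir" ]; then
    echo "no such directory: $data_dir" >&2
    return 1
  fi
  # Block data files are named blk_<id>; the matching checksum files end in .meta.
  find "$data_dir" -type f -name 'blk_*' ! -name '*.meta' | sort
}

# Example call; the path is an assumption based on the question's configuration.
list_block_files /home/hadoop/datastore/dfs/data || true
```

The file names you get back are opaque block IDs, not the HDFS file names; the mapping from /user/hadoop/... paths to blocks is held by the NameNode.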

Animesh Raj Jha answered Nov 13 '22 12:11
