
What is the path to a directory within the Hadoop filesystem?

I recently started learning Hadoop and Mahout. I want to know where a directory in the Hadoop filesystem actually lives on disk.

In hadoop-1.2.1/conf/core-site.xml, I have specified:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/Li/File/Java/hdfstmp</value>
  <description>A base for other temporary directories.</description>
</property>

In the Hadoop filesystem, I have the following directories:

lis-macbook-pro:Java Li$ hadoop fs -ls
Found 4 items
drwxr-xr-x   - Li supergroup          0 2013-11-06 17:25 /user/Li/output
drwxr-xr-x   - Li supergroup          0 2013-11-06 17:24 /user/Li/temp
drwxr-xr-x   - Li supergroup          0 2013-11-06 14:50 /user/Li/tweets-seq
-rw-r--r--   1 Li supergroup    1979173 2013-11-05 15:50 /user/Li/u.data

Now, where is the /user/Li/output directory?

I tried:

lis-macbook-pro:usr Li$ cd /user/Li/output
-bash: cd: /user/Li/output: No such file or directory

So I think /user/Li/output is a relative path, not an absolute path.

Then I searched for it in /Users/Li/File/Java/hdfstmp. There are two folders:

dfs

mapred

But I still can't find /user/Li/output within /Users/Li/File/Java/hdfstmp.

Li' asked Nov 12 '13 19:11





1 Answer

Your first call to hadoop fs -ls is a relative directory listing. With no path argument, it is resolved against the current user's HDFS home directory, which is typically /user/${user.name}. So your hadoop fs -ls command is listing files and directories relative to that location, in your case /user/Li/. The paths it prints are absolute paths inside HDFS, not paths on your Mac.

You should be able to confirm this by running an absolute listing and checking that the output matches: hadoop fs -ls /user/Li/
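To make the resolution rule concrete, here is a minimal sketch in plain shell. resolve_hdfs_path is a hypothetical helper written for illustration only; it is not part of Hadoop and does not contact HDFS. It assumes the default home directory layout of /user/<name>.

```shell
# Hypothetical helper (NOT a Hadoop command): mimics how "hadoop fs"
# resolves a path argument against the default HDFS home /user/<name>.
resolve_hdfs_path() {
  hdfs_path=$1
  hdfs_user=${2:-$(id -un)}   # fall back to the local user name
  case "$hdfs_path" in
    /*) printf '%s\n' "$hdfs_path" ;;                      # already absolute
    "") printf '/user/%s\n' "$hdfs_user" ;;                # no argument: home directory
    *)  printf '/user/%s/%s\n' "$hdfs_user" "$hdfs_path" ;; # relative to home
  esac
}

resolve_hdfs_path output Li          # prints /user/Li/output
resolve_hdfs_path /user/Li/temp Li   # prints /user/Li/temp
resolve_hdfs_path "" Li              # prints /user/Li
```

This is why hadoop fs -ls and hadoop fs -ls /user/Li/ should show the same entries.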

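As these files are in HDFS, you will not be able to find them under those names on the local filesystem. They are distributed across your cluster nodes as blocks (for real files), and as metadata entries (for files and directories) in the NameNode. The dfs folder you found under hadoop.tmp.dir is, by default, where that block and metadata storage lives, which is why there is no /user/Li/output directory inside it and why cd cannot reach it.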
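If what you want is a local copy you can cd into or open, you can pull it out of HDFS with hadoop fs -get (or the equivalent -copyToLocal). A small sketch follows; fetch_from_hdfs is a hypothetical wrapper name, and the local destination paths are only examples.

```shell
# Hypothetical wrapper around "hadoop fs -get": copies a file or directory
# out of HDFS into the local filesystem.
fetch_from_hdfs() {
  src=$1    # path inside HDFS, e.g. /user/Li/output
  dest=$2   # local destination, e.g. ./output (example path)
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop not found on PATH" >&2
    return 127
  fi
  hadoop fs -get "$src" "$dest"
}

# Usage (requires a running HDFS, so left commented out here):
#   fetch_from_hdfs /user/Li/output ./output
#   fetch_from_hdfs /user/Li/u.data ./u.data
```

After the copy, ./output is an ordinary local directory and the usual cd and ls work on it.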

Chris White answered Sep 28 '22 01:09
