
How to find the Hadoop HDFS directory on my system?

How do I find the Hadoop HDFS directory on my system? I need it to run the following command -

hadoop dfs -copyFromLocal <local-dir> <hdfs-dir>

In this command I don't know my hdfs-dir.

Not sure if it's helpful or not, but I ran the following command and got this output -

 hdfs dfs -ls
-rw-r--r--   3 popeye hdfs  127162942 2016-04-01 19:47 .

In hdfs-site.xml, I found the following entry -

<property>
      <name>dfs.datanode.data.dir</name>
      <value>/hadoop/hdfs/data</value>
      <final>true</final>
</property>

I tried to run the following command, but it gives an error -

[root@sandbox try]# hdfs dfs -copyFromLocal 1987.csv /hadoop/hdfs/data
copyFromLocal: `/hadoop/hdfs/data': No such file or directory

FYI - I am doing all this on a Hortonworks sandbox on an Azure server.

asked Apr 02 '16 by N..

People also ask

How do I view a directory in hadoop?

To browse the HDFS file system in the HDFS NameNode UI, select Utilities > Browse the file system. The Browse Directory page is populated. Enter the directory path and click Go!

Where is my HDFS path URL?

The Hadoop configuration file is located by default at /etc/hadoop/hdfs-site.xml. There you can find the dfs.namenode properties.
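If you'd rather not dig through the XML, you can also query a single property from the command line. A minimal sketch (the dfs.datanode.data.dir key is the one from the question's hdfs-site.xml):

# Print one configuration value from the active Hadoop configuration
hdfs getconf -confKey dfs.datanode.data.dir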

Where is HDFS in hadoop?

HDFS has a primary NameNode, which keeps track of where file data is kept in the cluster. HDFS also has multiple DataNodes on a commodity hardware cluster -- typically one per node in a cluster. The DataNodes are generally organized within the same rack in the data center.
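One way to see this layout on a running cluster (a minimal sketch; on most setups the full report requires HDFS superuser privileges):

# Ask the NameNode for a cluster summary, including each live DataNode
hdfs dfsadmin -report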

How do I list a directory in HDFS?

The following arguments are available with the hadoop ls command:

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
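For example, a minimal sketch (assuming /user exists, as it does on most clusters):

# Recursively list /user with human-readable file sizes
hadoop fs -ls -R -h /user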


3 Answers

Your approach is wrong, or maybe your understanding is.

dfs.datanode.data.dir is the local filesystem path where the DataNode stores its data blocks; it is not an HDFS path you can copy into.

If you type hdfs dfs -ls / you will get the list of directories in HDFS. Then you can transfer files from local to HDFS with -copyFromLocal or -put into a particular directory, or create a new directory with -mkdir, as sketched below.
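A minimal sketch of that flow (assuming the asker's username popeye and the 1987.csv file from the question):

# See what already exists at the HDFS root
hdfs dfs -ls /

# Create a home directory for the user (-p creates missing parents)
hdfs dfs -mkdir -p /user/popeye

# Copy the local file into it
hdfs dfs -copyFromLocal 1987.csv /user/popeye/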

Refer to the link below for more information -

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html

answered by BruceWayne

If you run:

hdfs dfs -copyFromLocal foo.txt bar.txt

then the local file foo.txt will be copied into your own HDFS home directory as /user/popeye/bar.txt (where popeye is your username). As a result, the following achieves the same:

hdfs dfs -copyFromLocal foo.txt /user/popeye/bar.txt

Before copying any file into HDFS, just be certain to create the parent directory first, as sketched below. You don't have to put files in this "home" directory, but (1) it is better not to clutter "/" with all sorts of files, and (2) following this convention helps prevent conflicts with other users.
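A minimal sketch (assuming the same popeye username; foo.txt is the hypothetical local file from this answer):

# Create the parent ("home") directory once
hdfs dfs -mkdir -p /user/popeye

# Now the relative-path copy works
hdfs dfs -copyFromLocal foo.txt bar.txt

# Verify the result
hdfs dfs -ls /user/popeye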

answered by michael


Elaborating on the first answer, in more detail for Hadoop 1.x -

Suppose you are running this on a pseudo-distributed cluster; you will probably see one or two user directories listed.

On a fully distributed cluster, you need administrator rights to perform these steps, and there will be N user directories listed.

So now to the point -

First go to your Hadoop home directory and from there run this command -

bin/hadoop fs -ls /

The result will look like this -

drwxr-xr-x   - xuiob78126arif supergroup          0 2017-11-30 11:20 /user

Here xuiob78126arif is my user, and that user's HDFS home directory is -

/user/xuiob78126arif/

Now you can open this address in your browser -

http://xuiob78126arif:50070

and from there you can see the Cluster Summary, NameNode Storage, etc.

Note: the command will only return results if at least one file or directory already exists in HDFS; otherwise you will get -

ls: Cannot access .: No such file or directory.

So, in that case, first put a file with bin/hadoop fs -put <source file full path>

and thereafter run bin/hadoop fs -ls / again, as sketched below.
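A minimal sketch of that recovery (sample.txt is a hypothetical local file; xuiob78126arif is the user from the listing above):

# Hadoop 1.x style: put a local file into the user's HDFS home directory
bin/hadoop fs -put sample.txt /user/xuiob78126arif/

# Now the listing succeeds
bin/hadoop fs -ls /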

I hope this gives you some traction on your issue. Thanks.

answered by ArifMustafa