I have Hadoop installed in this location:
/usr/local/hadoop
Now I want to list the files inside the DFS. The command I used is:
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop dfs -ls
This gave me the files in the dfs
Found 3 items
drwxr-xr-x - hduser supergroup 0 2014-03-20 03:53 /user/hduser/gutenberg
drwxr-xr-x - hduser supergroup 0 2014-03-24 22:34 /user/hduser/mytext-output
-rw-r--r-- 1 hduser supergroup 126 2014-03-24 22:30 /user/hduser/text.txt
Next, I tried the same command in a different way:
hduser@ubuntu:/usr/local/hadoop$ hadoop dfs -ls
It gave me the same result.
Could someone please explain why both work even though the ls command seems to be run from different folders? Please explain the difference between these two:
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop dfs -ls
hduser@ubuntu:/usr/local/hadoop$ hadoop dfs -ls
These Hadoop HDFS commands can be run on a pseudo-distributed cluster or from any of the VMs such as Hortonworks, Cloudera, etc.
Hadoop provides two commands for interacting with the file system: hadoop fs and hdfs dfs.
Use the hdfs dfs -ls command to list files in Hadoop archives. Run the hdfs dfs -ls command by specifying the archive directory location. Note that the modified parent argument causes the files to be archived relative to /user/ .
Run the command % $HADOOP_INSTALL/hadoop/bin/start-dfs.sh on the node you want the Namenode to run on. This will bring up HDFS with the Namenode running on the machine you ran the command on and Datanodes on the machines listed in the slaves file mentioned above.
In Unix, an executable file can be run in two ways: by giving its absolute or relative path, or by giving just the command name, in which case the shell searches the directories listed in the PATH variable.
When you execute bin/hadoop dfs -ls you are using a relative path, so you must be inside the directory /usr/local/hadoop. The absolute form /usr/local/hadoop/bin/hadoop dfs -ls will also work, from any directory.
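The difference between the relative and absolute forms can be tried out safely with a small sketch. It assumes a throwaway directory standing in for /usr/local/hadoop and a stand-in script named hadoop (both hypothetical, not the real Hadoop launcher):

```shell
# Stand-in layout (hypothetical): $root plays the role of /usr/local/hadoop.
root=$(mktemp -d)
mkdir -p "$root/bin" "$root/elsewhere"
printf '#!/bin/sh\necho "stand-in hadoop ran"\n' > "$root/bin/hadoop"
chmod +x "$root/bin/hadoop"

# Relative path: resolved against the current directory, so it works here...
cd "$root"
inside=$(bin/hadoop dfs -ls)

# ...but not from another directory, where there is no bin/hadoop.
cd "$root/elsewhere"
if bin/hadoop dfs -ls 2>/dev/null; then outside="found"; else outside="not found"; fi

# Absolute path: works regardless of the current directory.
absolute=$("$root/bin/hadoop" dfs -ls)

echo "inside: $inside"
echo "outside: $outside"
echo "absolute: $absolute"
```

In both commands from the question the current directory is /usr/local/hadoop, which is why bin/hadoop resolves there.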
The PATH environment variable in Unix holds the list of directories that are searched for executables. A typical default value is /usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
Whenever we execute a command such as ls or mkdir, it is found in one of the directories in PATH. When you give the command hadoop, it is taken from /usr/local/hadoop/bin/, because you have added /usr/local/hadoop/bin/ to your PATH variable. Use the following command to check the value of your PATH variable:
echo $PATH
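To see which file the shell picks for a bare command name, a sketch like the following may help. It again uses a throwaway directory and a stand-in script named hadoop (hypothetical, not the real launcher), and it puts that directory first in PATH so the result is the same whether or not a real Hadoop is installed:

```shell
# Create a throwaway directory with a stand-in "hadoop" script (hypothetical,
# not the real Hadoop launcher) to show how PATH lookup works.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/bin"
printf '#!/bin/sh\necho "fake hadoop called with: $*"\n' > "$demo_dir/bin/hadoop"
chmod +x "$demo_dir/bin/hadoop"

# 1) Explicit path: the name contains a slash, so no PATH lookup happens.
by_path=$("$demo_dir/bin/hadoop" dfs -ls)

# 2) Bare name: the shell searches each PATH entry in order.
PATH="$demo_dir/bin:$PATH"
by_name=$(hadoop dfs -ls)

# command -v reports the file a bare name resolves to.
resolved=$(command -v hadoop)

echo "$by_path"
echo "$by_name"
echo "bare name resolved to: $resolved"
```

On your own machine, running command -v hadoop should print the path under /usr/local/hadoop/bin if that directory is in PATH as described.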
You have set the global Hadoop path HADOOP_HOME in your ~/.bashrc file and, presumably, added its bin directory to PATH, so that Hadoop commands work from anywhere in the terminal.
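A ~/.bashrc setup along these lines might look like the sketch below. The install location /usr/local/hadoop comes from the question; that your file contains exactly these two lines is an assumption:

```shell
# Assumed install location, taken from the question; adjust if yours differs.
export HADOOP_HOME=/usr/local/hadoop
# Append Hadoop's bin directory so the bare `hadoop` command can be found.
export PATH="$PATH:$HADOOP_HOME/bin"
```

Setting HADOOP_HOME alone does not make the bare command available; it is the PATH entry that lets the shell find it.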