
Reading files from hdfs vs local directory

I am a beginner in Hadoop and have two questions.

1) How do I access files stored in HDFS? Is it the same as using a FileReader from java.io with a local path, or is it something else?

2) I have created a folder into which I copied the file to be stored in HDFS and the jar file of the MapReduce program. When I run the following command from any directory

${HADOOP_HOME}/bin/hadoop dfs -ls

it just shows me all the files in the current directory. Does that mean all the files were added to HDFS without my explicitly adding them?

Ruppesh Nalwaya asked Oct 02 '22 11:10


1 Answer

  1. Yes, it's pretty much the same, except that you open the file through Hadoop's FileSystem API rather than java.io.FileReader. Read this post to read files from HDFS.

  2. Keep in mind that HDFS is separate from your local file system. With hadoop dfs you access HDFS, not the local file system. So hadoop dfs -ls /path/in/HDFS shows you the contents of the /path/in/HDFS directory in HDFS, not a local one. That's why the output is the same no matter where you run it from.

If you want to "upload" / "download" files to/from HDFS you should use the commands:

hadoop dfs -copyFromLocal /local/path /path/in/HDFS and

hadoop dfs -copyToLocal /path/in/HDFS /local/path, respectively.
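
On the first point, reading a text file from HDFS in Java might look roughly like the sketch below. It assumes the Hadoop client libraries are on the classpath; the class name HdfsReadExample and the helper readLines are made up for illustration, and the URI passed on the command line (for example one starting with hdfs://) depends on your cluster.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical example class, not part of Hadoop itself.
public class HdfsReadExample {

    // Reads all lines of a text file through Hadoop's FileSystem abstraction.
    // fs.open(...) returns an input stream, so from here on it is ordinary java.io code.
    static List<String> readLines(FileSystem fs, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: the file to read, as a path or full URI (assumption: supplied by the caller).
        Configuration conf = new Configuration();
        Path file = new Path(args[0]);
        // Picks the file system from the path's scheme, or the configured default if it has none.
        FileSystem fs = file.getFileSystem(conf);
        for (String line : readLines(fs, file)) {
            System.out.println(line);
        }
    }
}
```

The only real difference from a FileReader is how the stream is opened. The same readLines helper also works on a file:// path, which is handy for trying it out without a cluster.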

vefthym answered Oct 27 '22 00:10