First hadoop project error: "Input path does not exist"

Tags:

hadoop

To set up a simple Hadoop project, I'm following this tutorial: http://ebiquity.umbc.edu/Tutorials/Hadoop/23%20-%20create%20the%20project.html

My single-node Hadoop setup seems to be running correctly.

When I specify the In folder using this code:

FileInputFormat.setInputPaths(conf, new Path("In"));

I receive this error:

13/03/03 22:05:27 ERROR security.UserGroupInformation: PriviledgedActionException as:DEVUSER cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9100/user/DEVUSER/In

Currently the In folder is located on my local file system at C:\homedir\hadoop-1.0.4\In

Where do I need to create the "In" folder so that it appears in hdfs://localhost:9100/user/DEVUSER/In? Do I need to update an xml file to point to a folder on my local file system?

blue-sky asked Mar 03 '13 22:03


2 Answers

You need to upload your input files to the HDFS file system first:

bin/hadoop fs -mkdir In

will create a directory named /user/DEVUSER/In in HDFS (a relative HDFS path is resolved against your HDFS home directory, /user/DEVUSER).

bin/hadoop fs -put *.txt In

will copy all *.txt files from the current directory to the cluster (HDFS).

You seem to have skipped the "Upload data" chapter of the tutorial. Follow it and your problem should be solved.
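
The job looked for hdfs://localhost:9100/user/DEVUSER/In because a relative path such as "In" is qualified against the default file system and the user's HDFS home directory, not against the local working directory. A minimal sketch of that resolution, using only java.net.URI rather than Hadoop's own code; the class and method names are hypothetical, and the home directory is taken from the error message in the question:

```java
import java.net.URI;

class RelativePathDemo {
    // Hypothetical helper: resolves a relative path against a home directory URI.
    // It imitates, but does not reproduce, how Hadoop qualifies a relative input path.
    static String qualify(String homeDir, String relativePath) {
        // The trailing slash matters: without it, resolve() would replace
        // the last path segment instead of appending to it.
        URI base = URI.create(homeDir.endsWith("/") ? homeDir : homeDir + "/");
        return base.resolve(relativePath).toString();
    }

    public static void main(String[] args) {
        // Home directory as it appears in the error message from the question.
        System.out.println(qualify("hdfs://localhost:9100/user/DEVUSER", "In"));
    }
}
```

Running it prints the same location that appears in the error message, which is why the folder has to exist in HDFS rather than under C:\homedir\hadoop-1.0.4.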

harpun answered Oct 11 '22 03:10


If you don't want to upload the files to HDFS and would rather read them from your local file system, try setting your input path like this:

FileInputFormat.setInputPaths(conf, new Path("file:///absolute/path/to/the/In/folder/on/your/local/file/system"));
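
One detail worth checking is the number of slashes after "file:". The sketch below uses java.net.URI to show how the two forms are parsed; Hadoop's Path does its own parsing, so treat this as an illustration rather than a description of Hadoop internals. The class and method names are hypothetical, and file:///C:/homedir/hadoop-1.0.4/In is an assumed URI form of the Windows folder from the question:

```java
import java.net.URI;

class FileUriDemo {
    // Returns the authority (host) component of the URI, or null if it has none.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    // Returns the path component of the URI.
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // Two slashes: the first segment is read as a host name, not as part of the path.
        String twoSlashes = "file://homedir/hadoop-1.0.4/In";
        // Three slashes: empty authority, so everything after it is the path.
        String threeSlashes = "file:///C:/homedir/hadoop-1.0.4/In";

        System.out.println(authorityOf(twoSlashes) + " | " + pathOf(twoSlashes));
        System.out.println(authorityOf(threeSlashes) + " | " + pathOf(threeSlashes));
    }
}
```

With two slashes the first path segment ends up in the authority and is dropped from the path, so an absolute local path is normally written with three slashes.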

Aparajith Chandran answered Oct 11 '22 04:10