Run a local file system directory as input to a Mapper in a cluster

I gave the mapper an input from the local file system. It runs successfully from Eclipse, but not on the cluster, where it cannot find the local input path and fails with: input path does not exist. Can anybody tell me how to give a local file path to a mapper so that it can run on the cluster and write its output to HDFS?

asked Apr 11 '12 by user1326784


1 Answer

This is a very old question, but I recently faced the same issue. I am not sure how correct this solution is, but it worked for me; please point out any drawbacks. Here's what I did.

Reading a solution in the mail archives, I realised that if I change fs.default.name from hdfs://localhost:8020/ to file:///, the job can access the local file system. However, I didn't want this for all my MapReduce jobs, so I made a copy of core-site.xml in a local folder (the same one from which I submit my MR jar via hadoop jar).
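
For reference, the relevant part of that local copy of core-site.xml would look something like the following; the answer doesn't show the file itself, so this is just a sketch of the one property being overridden:

<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Point the default filesystem at the local disk instead of HDFS -->
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>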

In the driver class of my MR job I added:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
conf.addResource(new Path("/my/local/system/path/to/core-site.xml")); // local copy with fs.default.name = file:///
conf.addResource(new Path("/usr/lib/hadoop-0.20-mapreduce/conf/hdfs-site.xml")); // cluster HDFS settings

The MR job then reads its input from the local file system and writes its output to HDFS.
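
For completeness, here is a minimal sketch of how the rest of such a driver might look. The class name MyDriver, the mapper/reducer setup, and the paths /my/local/input and hdfs://localhost:8020/user/me/output are illustrative, not from the original answer:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Build the job from the configuration assembled above
Job job = new Job(conf, "local-input-to-hdfs");
job.setJarByClass(MyDriver.class); // placeholder driver class
// Resolves against file:/// because of the local core-site.xml
FileInputFormat.addInputPath(job, new Path("/my/local/input"));
// A fully qualified hdfs:// URI keeps the output on the cluster
FileOutputFormat.setOutputPath(job, new Path("hdfs://localhost:8020/user/me/output"));
System.exit(job.waitForCompletion(true) ? 0 : 1);

Qualifying the output path with an explicit hdfs:// scheme is what lets the input and output live on different filesystems within the same job.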

answered Nov 03 '22 by Suvarna Pattayil