I am using Hadoop 1.0.3 in pseudo-distributed mode, and my conf/core-site.xml is set as follows:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>mapred.child.tmp</name>
<value>/home/administrator/hadoop/temp</value>
</property>
</configuration>
So I believed that my default filesystem was set to HDFS. However, when I run the following code:
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
I thought that fs should be a DistributedFileSystem instance. However, it turns out to be a LocalFileSystem instance.
But, if I run the following code:
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9000");
FileSystem fs = FileSystem.get(conf);
Then fs is indeed a DistributedFileSystem instance.
Isn't my default FileSystem set to HDFS in core-site.xml? If not, how should I set that?
The Eclipse environment doesn't know where the conf directory under the Hadoop install directory is, and therefore cannot find your core-site.xml, unless that directory is added to the Eclipse classpath so the configuration files are loaded first.
Since it is not on the Eclipse classpath, only the default configuration (core-default.xml) bundled inside hadoop-*-core.jar (e.g. hadoop-0.20.2-core.jar for version 0.20) is loaded. That default sets the local filesystem as the default filesystem, which is why you are seeing a LocalFileSystem object instead of a DistributedFileSystem.
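If you prefer not to touch the Eclipse classpath, another option is to point the Configuration at the file explicitly with addResource(). A minimal sketch, where <HADOOP_INSTALL> stands in for your actual install path:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Load the cluster settings explicitly instead of relying on the classpath.
Configuration conf = new Configuration();
conf.addResource(new Path("<HADOOP_INSTALL>/conf/core-site.xml"));
FileSystem fs = FileSystem.get(conf); // should now be a DistributedFileSystem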
So, to add the <HADOOP_INSTALL>/conf directory to the Eclipse project classpath, go to the project properties (Project -> Properties) -> Java Build Path -> Libraries tab -> Add External Class Folder, and select the conf directory from <HADOOP_INSTALL>.
The above adds your core-site.xml to your Eclipse classpath, and all your settings should then override the default ones.
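To verify that the override took effect, a quick sketch is to print the filesystem class and URI; with the conf directory on the classpath you should see the HDFS values:

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
// Expected with conf/ on the classpath:
//   org.apache.hadoop.hdfs.DistributedFileSystem
//   hdfs://localhost:9000
System.out.println(fs.getClass().getName());
System.out.println(fs.getUri());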