
Write a file in hdfs with Java

Tags: java, hadoop, hdfs

I want to create a file in HDFS and write data in that. I used this code:

Configuration config = new Configuration();
FileSystem fs = FileSystem.get(config);
Path filenamePath = new Path("input.txt");
try {
    if (fs.exists(filenamePath)) {
        fs.delete(filenamePath, true);
    }
    FSDataOutputStream fin = fs.create(filenamePath);
    fin.writeUTF("hello");
    fin.close();
}

It creates the file, but it doesn't write anything in it. I searched a lot but didn't find anything. What is my problem? Do I need any permission to write in HDFS?

Thanks.

asked Apr 14 '13 by csperson

People also ask

How do I write to a file in HDFS?

To write a file in HDFS, a client first needs to interact with the master, i.e. the namenode. The namenode provides the addresses of the datanodes (slaves) on which the client will write the data. The client then writes directly to the datanodes, which form a data write pipeline.

How does HDFS store read and write files?

HDFS follows a Write Once Read Many model. So we can't edit files that are already stored in HDFS, but we can append to them by reopening the file. This design allows HDFS to scale to a large number of concurrent clients because the data traffic is spread across all the datanodes in the cluster.
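The reopen-and-append pattern described above can be sketched with the `FileSystem` API. This is an illustrative fragment only, assuming a running HDFS cluster reachable via the default configuration and an already-existing file at a hypothetical path; it is not taken from the answers below:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: requires a live HDFS and an existing file to append to.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path p = new Path("/tmp/log.txt");       // hypothetical example path
FSDataOutputStream out = fs.append(p);   // reopen the existing file
out.writeBytes("one more line\n");       // new data goes at the end
out.close();
```

Note that `append` is only supported by filesystems that implement it (HDFS does; some older versions required it to be enabled in the cluster configuration), and editing existing bytes in place is still impossible.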

What is the first step in a write process from a HDFS client?

To write a file into HDFS, the client first interacts with the NameNode. The NameNode checks the client's privileges to write a file. If the client has sufficient privileges and no file with the same name already exists, the NameNode creates a record for the new file.


2 Answers

As an alternative to @Tariq's answer, you can pass the URI when getting the filesystem:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Progressable;
import java.io.BufferedWriter;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.URI;

Configuration configuration = new Configuration();
// point directly at the namenode instead of relying on config files
FileSystem hdfs = FileSystem.get(new URI("hdfs://localhost:54310"), configuration);
Path file = new Path("hdfs://localhost:54310/s2013/batch/table.html");
if (hdfs.exists(file)) {
    hdfs.delete(file, true);
}
OutputStream os = hdfs.create(file,
    new Progressable() {
        public void progress() {
            System.out.println("...bytes written");
        }
    });
BufferedWriter br = new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));
br.write("Hello World");
br.close();
hdfs.close();
answered Oct 14 '22 by Miguel Pereira


Either set the HADOOP_CONF_DIR environment variable to point to your Hadoop configuration folder, or add the following 2 lines to your code:

config.addResource(new Path("/HADOOP_HOME/conf/core-site.xml"));
config.addResource(new Path("/HADOOP_HOME/conf/hdfs-site.xml"));

If you don't add this, your client will try to write to the local FS, resulting in the permission denied exception.
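Separately, it is worth knowing what `writeUTF` in the question actually produces: it is `DataOutputStream.writeUTF`, which prefixes the string with a two-byte length and encodes it as modified UTF-8, so the resulting file is not plain text. A small stdlib-only demonstration (no HDFS needed; `WriteUtfDemo` and `utfBytes` are names made up here for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Shows the on-disk format produced by writeUTF, the same method
// FSDataOutputStream inherits: a 2-byte big-endian length prefix
// followed by the modified-UTF-8 bytes of the string.
public class WriteUtfDemo {
    static byte[] utfBytes(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeUTF(s); // same call as fin.writeUTF("hello") in the question
        out.close();
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] b = utfBytes("hello");
        System.out.println(b.length);          // 7: 2-byte prefix + 5 ASCII bytes
        System.out.println(b[0] + " " + b[1]); // 0 5: the length prefix
    }
}
```

If you want a plain-text file, wrap the stream in an `OutputStreamWriter` (as in the other answer) or use `writeBytes` instead.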

answered Oct 14 '22 by Tariq