I want to create a file in HDFS and write data to it. I used this code:
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration config = new Configuration();
    FileSystem fs = FileSystem.get(config);
    Path filenamePath = new Path("input.txt");
    try {
        if (fs.exists(filenamePath)) {
            fs.delete(filenamePath, true);
        }
        FSDataOutputStream fin = fs.create(filenamePath);
        fin.writeUTF("hello");
        fin.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
It creates the file, but it doesn't write anything to it. I searched a lot but didn't find anything. What is my problem? Do I need any permission to write to HDFS?
Thanks.
To write a file to HDFS, a client first needs to interact with the NameNode (the master). The NameNode provides the addresses of the DataNodes (the slaves) on which the client will start writing the data. The client then writes the data directly to the DataNodes, which set up a data write pipeline among themselves.
HDFS follows a write-once-read-many model. So we can't edit files that are already stored in HDFS, but we can append to them by reopening the file. This design allows HDFS to scale to a large number of concurrent clients because the data traffic is spread across all the DataNodes in the cluster.
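As a rough sketch of what "reopening the file" looks like from the client side (assuming your Hadoop version and cluster configuration support appends, and with a placeholder path and text):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path existing = new Path("/user/hduser/input.txt");  // placeholder path
    // Reopen the existing file and add data at its end; the file is never edited in place.
    // On older Hadoop versions this may require append support to be enabled on the cluster.
    FSDataOutputStream out = fs.append(existing);
    out.writeUTF(" more data");
    out.close();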
To write a file to HDFS, the client first interacts with the NameNode. The NameNode checks the client's privileges to write the file. If the client has sufficient privileges and no file with the same name already exists, the NameNode creates a record for the new file.
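If you suspect a permission problem, you can inspect the owner, group, and permission bits of the target directory from the client side and compare them with the user your program runs as. A small sketch (the directory is a placeholder):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus status = fs.getFileStatus(new Path("/user/hduser"));  // placeholder directory
    // Print who owns the directory and its permission bits.
    System.out.println(status.getOwner() + ":" + status.getGroup() + " " + status.getPermission());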
As an alternative to @Tariq's answer, you could pass the URI when getting the FileSystem:
    import java.io.BufferedWriter;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.util.Progressable;

    Configuration configuration = new Configuration();
    FileSystem hdfs = FileSystem.get(URI.create("hdfs://localhost:54310"), configuration);
    Path file = new Path("hdfs://localhost:54310/s2013/batch/table.html");
    if (hdfs.exists(file)) {
        hdfs.delete(file, true);
    }
    OutputStream os = hdfs.create(file, new Progressable() {
        public void progress() {
            System.out.println("...bytes written");
        }
    });
    BufferedWriter br = new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));
    br.write("Hello World");
    br.close();
    hdfs.close();
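If your client JVM runs as a different OS user than the one that owns the target directory in HDFS, you can also name the user explicitly when getting the FileSystem. A sketch (the "hduser" name and the URI are just examples):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    // The third argument is the remote user name the requests are issued as.
    FileSystem hdfs = FileSystem.get(URI.create("hdfs://localhost:54310"),
                                     new Configuration(), "hduser");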
Either set the HADOOP_CONF_DIR environment variable to your Hadoop configuration folder, or add the following two lines to your code:
    config.addResource(new Path("/HADOOP_HOME/conf/core-site.xml"));
    config.addResource(new Path("/HADOOP_HOME/conf/hdfs-site.xml"));
If you don't add these, your client will try to write to the local file system, hence resulting in the permission denied exception.
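Alternatively, you can point the client at the cluster programmatically. A minimal sketch, assuming a NameNode at hdfs://localhost:9000 (adjust the host/port to match your core-site.xml; older Hadoop versions use the fs.default.name key instead of fs.defaultFS):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    Configuration config = new Configuration();
    // Placeholder NameNode address; use the value from your own core-site.xml.
    config.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fs = FileSystem.get(config);  // now resolves paths against HDFS, not the local FS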