I am trying to append to a file on HDFS on a single-node cluster. I also tried on a 2-node cluster but get the same exceptions.
In hdfs-site.xml I have dfs.replication set to 1. If I set dfs.client.block.write.replace-datanode-on-failure.policy to DEFAULT, I get the following exception:
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010], original=[10.10.37.16:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
If I follow the recommendation in the comment in hdfs-default.xml for extremely small clusters (3 nodes or less) and set dfs.client.block.write.replace-datanode-on-failure.policy to NEVER, I get the following exception:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot append to file/user/hadoop/test. Name node is in safe mode.
The reported blocks 1277 has reached the threshold 1.0000 of total blocks 1277. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 3 seconds.
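For reference, this is roughly how the same policy can be set on the client side instead of in hdfs-site.xml; a minimal sketch, using only the property key named in the exception message above:
import org.apache.hadoop.conf.Configuration;

// Client-side equivalent of the hdfs-site.xml setting; the property key is the
// one the exception message says a client may configure.
Configuration conf = new Configuration();
conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");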
Here's how I try to append:
import java.io.OutputStream;
import java.io.PrintWriter;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Point the client at the cluster and append a line to an existing file.
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://MY-MACHINE:8020/user/hadoop");
conf.set("hadoop.job.ugi", "hadoop");
FileSystem fs = FileSystem.get(conf);
OutputStream out = fs.append(new Path("/user/hadoop/test"));
PrintWriter writer = new PrintWriter(out);
writer.print("hello world");
writer.close();
Is there something I am doing wrong in the code, or is there something missing in the configuration? Any help will be appreciated!
EDIT
Even though dfs.replication is set to 1, when I check the status of the file through
FileStatus[] status = fs.listStatus(new Path("/user/hadoop"));
I find that status[i].block_replication is set to 3. I don't think this is the problem, because when I changed the value of dfs.replication to 0 I got a relevant exception, so it does obey the value of dfs.replication. But to be on the safe side, is there a way to change the block_replication value per file?
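For reference, the check looks roughly like this; a sketch reusing the fs instance from the snippet above, where FileStatus#getReplication() is the accessor behind what I call block_replication:
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Print the replication factor recorded for each file in the directory.
FileStatus[] status = fs.listStatus(new Path("/user/hadoop"));
for (FileStatus s : status) {
    System.out.println(s.getPath() + " -> replication " + s.getReplication());
}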
As I mentioned in the edit, even though dfs.replication is set to 1, fileStatus.block_replication is set to 3.
A possible solution is to run
hadoop fs -setrep -w 1 -R /user/hadoop/
Which will change the replication factor for each file recursively in the given directory. Documentation for the command can be found here.
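For a single file, FileSystem#setReplication is the programmatic counterpart of -setrep; a minimal sketch, reusing the fs instance and path from the question:
import org.apache.hadoop.fs.Path;

// Ask the NameNode to change this file's target replication factor to 1.
// Unlike -setrep -w, this returns immediately; blocks are adjusted asynchronously.
boolean changed = fs.setReplication(new Path("/user/hadoop/test"), (short) 1);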
What needs to be done now is to find out why the value in hdfs-site.xml is ignored, and how to force 1 to be the default.
EDIT
It turns out that the dfs.replication property has to be set in the Configuration instance too; otherwise, it requests the default replication factor for the file, which is 3, regardless of the value set in hdfs-site.xml.
Adding the following statement to the code solves it:
conf.set("dfs.replication", "1");
I also faced the same exception you initially posted, and I solved the problem thanks to your comments (setting dfs.replication to 1).
But there is something I don't understand: what happens if I do have replication? Is it not possible to append to a file in that case?
I would appreciate your answer, especially if you have experience with it.
Thanks