From within the Reducer's setup method, I am trying to close a BufferedReader object and getting a FileSystem closed exception. It does not happen all the time. This is the piece of code I used to create the BufferedReader:
String fileName = <some HDFS file path>
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path hdfsPath = new Path(fileName);
FSDataInputStream in = fs.open(hdfsPath);
InputStreamReader inputStreamReader = new InputStreamReader(in);
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
I read the contents from the bufferedReader, and once all the reading is done, I close it. This is the piece of code that reads it:
String line;
while ((line = bufferedReader.readLine()) != null) {
    // Do something
}
This is the piece of code that closes the reader:
if (bufferedReader != null) {
    bufferedReader.close();
}
This is the stack trace for the exception that happens when I do a bufferedReader.close():
I, [2013-11-18T04:56:51.601135 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
I, [2013-11-18T04:56:51.601168 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:522)
I, [2013-11-18T04:56:51.601199 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at java.io.FilterInputStream.close(FilterInputStream.java:155)
I, [2013-11-18T04:56:51.601230 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at sun.nio.cs.StreamDecoder.implClose(StreamDecoder.java:358)
I, [2013-11-18T04:56:51.601263 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at sun.nio.cs.StreamDecoder.close(StreamDecoder.java:173)
I, [2013-11-18T04:56:51.601356 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at java.io.InputStreamReader.close(InputStreamReader.java:182)
I, [2013-11-18T04:56:51.601395 #25683] INFO -- : attempt_201310111840_142285_r_000009_0: at java.io.BufferedReader.close(BufferedReader.java:497)
I am not sure why this exception is happening. This code is not multithreaded, so I do not expect a race condition of any sort. Can you please help me understand?
Thanks,
Venk
There is a little-known gotcha with the Hadoop FileSystem API: FileSystem.get returns the same cached object for every invocation against the same filesystem. So if it is closed anywhere, it is closed everywhere. You could debate the merits of this decision, but that's the way it is.
So, if you attempt to close your BufferedReader, and it tries to flush out some data it has buffered, but the underlying stream is connected to a FileSystem that is already closed, you'll get this error. Check your code for any other places you are closing a FileSystem object, and look for race conditions. Also, I believe Hadoop itself will at some point close the FileSystem, so to be safe, you should probably only be accessing it from within the Reducer's setup, reduce, or cleanup methods (or configure, reduce, and close, depending on which API you're using).
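To make that caching gotcha concrete, here is a minimal sketch (class name is hypothetical; it assumes the standard org.apache.hadoop.fs API and that filesystem caching has not been disabled via fs.*.impl.disable.cache) showing how two independent-looking lookups alias the same instance:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCacheGotcha {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Both calls resolve to the same cached FileSystem instance
        // for the same URI, configuration, and user.
        FileSystem fs1 = FileSystem.get(conf);
        FileSystem fs2 = FileSystem.get(conf);
        System.out.println(fs1 == fs2); // same object with default caching

        // Closing "one" closes the shared instance; any stream still
        // attached to it will now fail with "Filesystem closed".
        fs1.close();
        // fs2.open(somePath) would now throw an IOException.
    }
}
```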
You have to use FileSystem.newInstance to avoid using a shared connection (as described by Joe K). It will give you a unique, non-shared instance.
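A minimal sketch of that fix (class and method names are hypothetical, and the file path comes from the caller): open a private instance with FileSystem.newInstance, read through it, and close it yourself, since a newInstance filesystem is not in the shared cache and nobody else will close it for you:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrivateFsReader {
    public static void readFile(String fileName) throws Exception {
        Configuration conf = new Configuration();
        // newInstance bypasses the shared cache, so closing this
        // FileSystem cannot affect (or be affected by) anyone else.
        FileSystem fs = FileSystem.newInstance(conf);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(new Path(fileName))))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Do something with the line
            }
        } finally {
            // The caller owns a newInstance FileSystem and must close it.
            fs.close();
        }
    }
}
```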