
Bad File Descriptor IOException in Java using javax.xml

I'm using the standard javax.xml package to parse some XML files on a Linux machine. My code is as follows:

try 
{
    // Prepare parser
    DocumentBuilder documentBuilder = documentBuilderFactory
        .newDocumentBuilder();
    Document document = documentBuilder.parse(file.getAbsolutePath()); // This is line 397
    XPath xPath = xPathFactory.newXPath();
    ...
}
catch(IOException e) { ... }

A single DocumentBuilderFactory is shared by multiple threads, as is a single XPathFactory; I believe this to be acceptable usage. I occasionally see the following error when parsing an XML file with the above code:

java.io.IOException: Bad file descriptor
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:229)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:229)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:246)
        at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
        at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
        at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
        at mypackage.MyXmlParser.parseFile(MyXmlParser.java:397)
        at mypackage.MyXmlParser.access$500(MyXmlParser.java:51)
        at mypackage.MyXmlParser$1.call(MyXmlParser.java:337)
        at mypackage.MyXmlParser$1.call(MyXmlParser.java:328)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
        at java.lang.Thread.run(Thread.java:799)

I occasionally (~10% of the time) see the following additional text:

Caused by:
java.io.IOException: Bad file descriptor
        at org.apache.xml.serializer.ToStream.flushWriter(ToStream.java:260)
        at org.apache.xml.serializer.ToXMLStream.endDocument(ToXMLStream.java:191)
        at org.apache.xalan.transformer.TransformerIdentityImpl.endDocument(TransformerIdentityImpl.java:983)
        at org.apache.xml.serializer.TreeWalker.traverse(TreeWalker.java:174)
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:410)
        ... 9 more

When I inspect the files manually, I can see no difference between the files that fail and the files that pass. I can confirm the files that pass are valid XML and have no special characters or premature endings.

Does anyone know why this might be happening, and how I can avoid it?

> java -version
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build pxa64dev-20061002a (SR3) )
IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 Linux amd64-64 j9vmxa6423-20061001 (JIT enabled)
J9VM - 20060915_08260_LHdSMr
JIT  - 20060908_1811_r8
GC   - 20060906_AA)
JCL  - 20061002

Asked by Ina, Nov 04 '22 01:11


1 Answer

It looks like an issue with concurrent threads.

The error could be somewhere outside the snippet you have shown us. I'm also not sure whether DocumentBuilderFactory and XPathFactory are thread-safe; it is not mentioned in the documentation.

As a first test, I recommend putting all of the code that parses XML files into a synchronized {} block. If this solves your problem, then it is definitely a multithreading problem. In that case, you then have to find the smallest part of the code that must be synchronized.
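
A minimal sketch of that first test could look like the following. It assumes the two factories are shared static fields, as the question describes. The class name `SynchronizedXmlParser`, the lock object `PARSE_LOCK`, the method `rootElementName` and the XPath expression are all hypothetical and only there to make the example complete. It also passes the `File` itself to `parse`, since the `String` overload of `DocumentBuilder.parse` expects a URI rather than a file system path; whether that matters for this particular error is not known.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper class; none of these names come from the question.
final class SynchronizedXmlParser {

    // Factories shared by all worker threads, as in the question.
    private static final DocumentBuilderFactory DOCUMENT_BUILDER_FACTORY =
            DocumentBuilderFactory.newInstance();
    private static final XPathFactory XPATH_FACTORY = XPathFactory.newInstance();

    // Single lock guarding every use of the shared factories (assumption:
    // nothing else in the application touches them without this lock).
    private static final Object PARSE_LOCK = new Object();

    private SynchronizedXmlParser() {
    }

    // Parses the file and returns the name of its root element.
    // The whole parse is serialized on purpose: this is the coarse
    // "first test", not the final, fine-grained locking.
    static String rootElementName(File file) throws Exception {
        synchronized (PARSE_LOCK) {
            DocumentBuilder documentBuilder =
                    DOCUMENT_BUILDER_FACTORY.newDocumentBuilder();
            Document document = documentBuilder.parse(file);
            XPath xPath = XPATH_FACTORY.newXPath();
            return xPath.evaluate("name(/*)", document);
        }
    }
}
```

If the errors disappear with this in place, the block can be narrowed step by step, for example to only the `newDocumentBuilder()` and `newXPath()` calls. If they persist, the cause is more likely outside the parsing code, such as a stream being closed by another thread.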

Answered by Johanna, Nov 07 '22 22:11