Since ByteArrayInputStream is limited to 2GB, is there any alternate solution that allows me to store the whole contents of a 2.3GB (and possibly larger) file into an InputStream to be read by Stax2?
Current code:
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(in); //ByteArrayInputStream????
            try
            {
                SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
                Schema schema = factory.newSchema(new StreamSource(schemaInputStream));
                Validator validator = schema.newValidator();
                validator.validate(new StAXSource(xmlStreamReader));
            }
            finally
            {
                xmlStreamReader.close();
            }
For performance tuning, variable in must not come from disk. I have plenties of RAM.
Provide more memory to your JVM (usually using -Xmx / -Xms ) or don't load all the data into memory. For many operations on huge amounts of data there are algorithms which don't need access to all of it at once. One class of such algorithms are divide and conquer algorithms.
Java BufferedReader() & FileReader() Implementation BufferedReader reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines, and it is wrapped around the FileReader method, which is the actual method reading the specified text file.
The whole point of StAX2 is that you do not need to read the file in to memory. You can just supply the source, and let the StAX StreamReader pull the data as it needs to.
What additional constraints do you have that you are not showing in your question?
If you have lots of memory, and you want to get good performance, just wrap your InputStream with a large byte buffer, and let the buffer do the buffering for you:
// 4 meg buffer on the stream
InputStream buffered = new BufferedInputStream(schemaInputStream, 1024 * 1024 * 4);
An alternative to solving this in Java is to create a RAMDisk, and to store the file on that, which would remove the problem from Java, where your basic limitation is that you can only have just less than Integer.MAX_VALUE values in a single array.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With