We are using Java SAX to parse really big XML files. Our characters
implementation looks like the following:
@Override
public void characters(char ch[], int start, int length) throws SAXException {
    String value = String.copyValueOf(ch, start, length);
    ...
}
(The ch[] arrays passed by SAX tend to be pretty long.)
Recently we have been running into performance issues, and the profiler shows that over 20% of our CPU usage is spent in the above invocation of String.copyValueOf (which invokes new String(ch, start, length) under the hood).
Is there a more efficient way to obtain a String from an array of characters, a start index and a length than String.copyValueOf(ch, start, length) or new String(ch, start, length)?
Good question, but I'm fairly sure the answer is no.
This is because constructing any String object copies the character array. A String cannot be built directly on top of an existing array, because a String object must be immutable and its internal character array is encapsulated from outside changes.
Furthermore, in your case you are dealing with a fragment of an array. There is no way to build a String object on a fragment of another array without copying it.
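The copying behaviour described above can be observed with a small sketch. The class and method names here (StringCopyDemo, fragmentThenMutate) are made up for illustration and are not part of the question's code:

```java
public class StringCopyDemo {
    // Builds a String from a fragment of a char array, then mutates the array.
    static String fragmentThenMutate() {
        char[] ch = {'a', 'b', 'c', 'd', 'e'};
        String value = new String(ch, 1, 3); // copies 'b', 'c', 'd' out of the array
        ch[1] = 'X';                         // changing the source array afterwards...
        return value;                        // ...does not change the String
    }

    public static void main(String[] args) {
        System.out.println(fragmentThenMutate()); // prints "bcd"
    }
}
```

If the String shared the array instead of copying it, the later write to ch[1] would show up in value, which would break immutability.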
As stated by @Andremoniy, if you want to use a String object, it always has to be created and the contents have to be copied into it.
The only way to speed up your parser is to reduce the number of newly built String objects to a minimum.
I doubt that every element in your XML structure contains raw data between its start and end tags.
Therefore I would suggest creating Strings only while you are inside an element whose data is of interest. Moreover, I would suggest narrowing down the candidate elements somehow, for example by hierarchy level or by parent element, to reduce the number of string comparisons. But this depends on the XML structure.
protected boolean readChars = false;
protected int level = -1;

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    ++level;
    if (level == 4) {
        if (qName.equalsIgnoreCase("TextElement")) {
            readChars = true;
        }
    }
}

@Override
public void characters(char ch[], int start, int length) throws SAXException {
    if (readChars) {
        String value = String.copyValueOf(ch, start, length);
        ...
        readChars = false;
    }
}

@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
    --level;
}
In addition, since characters may be called more than once inside a single element, holding a StringBuilder at element level might be appropriate. Appending to it does a System.arraycopy.
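A minimal sketch of the StringBuilder idea, assuming the element of interest is called TextElement as in the code above. The class name TextElementHandler, the values list and the parse helper are hypothetical and only there to make the example runnable; the level check is left out for brevity:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TextElementHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();
    private boolean readChars = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals("TextElement")) {
            buffer.setLength(0); // reuse the same builder for every element
            readChars = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (readChars) {
            buffer.append(ch, start, length); // no String is created per chunk
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (readChars && qName.equals("TextElement")) {
            values.add(buffer.toString()); // one String per element of interest
            readChars = false;
        }
    }

    // Hypothetical helper: parses an XML string and returns the collected texts.
    static List<String> parse(String xml) {
        try {
            TextElementHandler handler = new TextElementHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><TextElement>hello</TextElement><Other>skip</Other></root>";
        System.out.println(parse(xml)); // prints "[hello]"
    }
}
```

This still copies the characters, once into the builder and once in toString(), but it creates a String only for elements of interest and copes with the text of one element arriving in several characters calls. Text inside elements nested in TextElement would also be appended in this sketch.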