I've recently been playing with code for an iPhone app to parse XML. Sticking to Cocoa, I decided to go with the NSXMLParser class. The app will be responsible for parsing 10,000+ "computers", each of which contains 6 other strings of information. For my test, I've verified that the XML is around 900 KB to 1 MB in size.
My data model is to keep each computer in an NSDictionary keyed by a unique identifier. Each computer is itself represented by an NSDictionary holding its information. So at the end of the day, I end up with an NSDictionary containing 10k other NSDictionaries.
The problem I'm running into isn't about leaking memory or efficient data structure storage. When my parser is done, the total amount of allocated objects only goes up by about 1 MB. The problem is that while the NSXMLParser is running, my object allocation jumps by as much as 13 MB. I could understand 2 MB (one for the object I'm creating and one for the raw NSData) plus a little room to work, but 13 MB seems a bit high. I can't imagine that NSXMLParser is that inefficient. Thoughts?
Code...
The code to start parsing...
NSXMLParser *parser = [[NSXMLParser alloc] initWithData: data];
[parser setDelegate:dictParser];
[parser parse];
output = [[dictParser returnDictionary] retain];
[parser release];
[dictParser release];
And the parser's delegate code...
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict {
    if(mutableString)
    {
        [mutableString release];
        mutableString = nil;
    }
    mutableString = [[NSMutableString alloc] init];
}

-(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if(self.mutableString)
    {
        [self.mutableString appendString:string];
    }
}

-(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if([elementName isEqualToString:@"size"]){
        //The initial key, tells me how many computers
        returnDictionary = [[NSMutableDictionary alloc] initWithCapacity:[mutableString intValue]];
    }
    if([elementName isEqualToString:hashBy]){
        //The unique identifier
        if(mutableDictionary){
            [mutableDictionary release];
            mutableDictionary = nil;
        }
        mutableDictionary = [[NSMutableDictionary alloc] initWithCapacity:6];
        [returnDictionary setObject:[NSDictionary dictionaryWithDictionary:mutableDictionary] forKey:[NSMutableString stringWithString:mutableString]];
    }
    if([fields containsObject:elementName]){
        //Any of the elements from a single computer that I am looking for
        [mutableDictionary setObject:mutableString forKey:elementName];
    }
}
Everything is initialized and released correctly. Again, I'm not getting errors or leaking memory. It's just inefficient.
Thanks for any thoughts!
NSXMLParser is a memory hog:
initWithURL: will download the full XML before processing it. For memory use this is bad, as it has to allocate memory for the full XML, which can't be reclaimed until the end of the parse. For performance it's also bad, as you cannot interleave the I/O-intensive part of downloading with the CPU-intensive part of parsing.
[...] NSAutoreleasePool but without any success.
Alternatives are libxml and AQXMLParser, which is an NSXMLParser-compatible wrapper around libxml, or ObjectiveXML.
See my blog article for more details.
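To illustrate the libxml alternative, here is a minimal sketch of libxml2's push (chunked SAX) interface, which lets parsing proceed while data is still arriving instead of holding the whole document in memory. It is not taken from the blog article. The names ParseState, startElementCallback and CountElementsInStream are hypothetical, the 4 KB buffer size is arbitrary, and it assumes the project links against libxml2 with its headers on the search path. It only counts elements; a real delegate replacement would also set the endElement and characters callbacks.

```objc
#import <Foundation/Foundation.h>
#include <string.h>
#include <libxml/parser.h>

// Hypothetical per-parse state, passed to every SAX callback as user data.
typedef struct {
    NSUInteger elementCount;
} ParseState;

// SAX callback: libxml2 invokes this each time an opening tag has been parsed.
static void startElementCallback(void *ctx, const xmlChar *name, const xmlChar **atts) {
    ParseState *state = (ParseState *)ctx;
    state->elementCount++;
}

// Reads the stream in small chunks and hands each chunk to libxml2,
// so only one buffer of raw XML is held by this function at a time.
static NSUInteger CountElementsInStream(NSInputStream *stream) {
    xmlSAXHandler handler;
    memset(&handler, 0, sizeof(handler));   // unused callbacks stay NULL
    handler.startElement = startElementCallback;

    ParseState state = { 0 };
    xmlParserCtxtPtr ctxt = xmlCreatePushParserCtxt(&handler, &state, NULL, 0, NULL);
    if (ctxt == NULL) {
        return 0;
    }

    uint8_t buffer[4096];
    NSInteger bytesRead;
    [stream open];
    while ((bytesRead = [stream read:buffer maxLength:sizeof(buffer)]) > 0) {
        // terminate = 0: more data will follow
        xmlParseChunk(ctxt, (const char *)buffer, (int)bytesRead, 0);
    }
    // terminate = 1: tell the parser the document is complete
    xmlParseChunk(ctxt, NULL, 0, 1);
    [stream close];

    xmlFreeParserCtxt(ctxt);
    return state.elementCount;
}
```

The same loop could be driven from NSURLConnection data callbacks rather than an NSInputStream, which is what makes it possible to overlap downloading with parsing.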