
NSXMLParser Memory Allocation Efficiency for the iPhone

I've recently been playing with code for an iPhone app to parse XML. Sticking to Cocoa, I decided to go with the NSXMLParser class. The app will be responsible for parsing 10,000+ "computers", each of which contains 6 other strings of information. For my test, I've verified that the XML is around 900KB-1MB in size.

My data model is to keep each computer in an NSDictionary keyed by a unique identifier. Each computer is itself represented by an NSDictionary holding its information. So at the end of the day, I end up with an NSDictionary containing 10k other NSDictionaries.

The problem I'm running into isn't about leaking memory or efficient data structure storage. When my parser is done, the total amount of allocated objects only goes up by about 1MB. The problem is that while the NSXMLParser is running, my object allocation jumps by as much as 13MB. I could understand 2MB (one for the object I'm creating and one for the raw NSData) plus a little room to work, but 13MB seems a bit high. I can't imagine that NSXMLParser is that inefficient. Thoughts?

Code...

The code to start parsing...

NSXMLParser *parser = [[NSXMLParser alloc] initWithData: data];
[parser setDelegate:dictParser];
[parser parse];
output = [[dictParser returnDictionary] retain];        
[parser release];
[dictParser release];

And the parser's delegate code...

-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict {

    if(mutableString)
    {
        [mutableString release];
        mutableString = nil;

    }

    mutableString = [[NSMutableString alloc] init];     

}

-(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string { 
    if(self.mutableString)
    {

        [self.mutableString appendString:string];

    }
}

-(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {

    if([elementName isEqualToString:@"size"]){
        //The initial key, tells me how many computers
        returnDictionary = [[NSMutableDictionary alloc] initWithCapacity:[mutableString intValue]];
    }

    if([elementName isEqualToString:hashBy]){
        //The unique identifier
        if(mutableDictionary){
            [mutableDictionary release];
            mutableDictionary = nil;
        }

        mutableDictionary = [[NSMutableDictionary alloc] initWithCapacity:6];

        [returnDictionary setObject:[NSDictionary dictionaryWithDictionary:mutableDictionary] forKey:[NSMutableString stringWithString:mutableString]];
    }

    if([fields containsObject:elementName]){
        //Any of the elements from a single computer that I am looking for
        [mutableDictionary setObject:mutableString forKey:elementName];
    }
}

Everything is initialized and released correctly. Again, I'm not getting errors or leaking. It's just inefficient.

Thanks for any thoughts!

Staros asked Jan 22 '10 15:01



1 Answer

NSXMLParser is a memory hog:

  1. It is not a real streaming parser: initWithURL: will download the full XML before processing it. For memory use this is bad, as it has to allocate memory for the full XML, which can’t be reclaimed until the end of the parse. For performance it’s also bad, as you cannot interleave the I/O-intensive part of downloading with the CPU-intensive part of parsing.
  2. It will not release memory. It seems that strings/dictionaries created during parsing are kept around until the end of the parse. I’ve tried to improve it with creative use of NSAutoreleasePool, but without any success.

Alternatives are libxml, AQXMLParser (an NSXMLParser-compatible wrapper around libxml), or ObjectiveXML.
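To give an idea of what the libxml route looks like, here is a minimal sketch of a push (chunked) parse using libxml2's SAX interface. The class name ChunkedParser and all of its methods are made up for illustration; only the xml* calls come from libxml2. It assumes manual reference counting as in the question, that libxml2 is linked with its headers on the search path, and that the library was built with SAX1 callbacks enabled. I have not measured its memory use, so treat it as a starting point rather than a proven fix.

```objc
#import <Foundation/Foundation.h>
#import <libxml/parser.h>   // assumes libxml2 is linked and its headers are found

// Hypothetical wrapper class; the name and methods are illustrative only.
@interface ChunkedParser : NSObject {
    xmlParserCtxtPtr context;      // libxml2 push-parser state
    NSMutableString *currentText;  // text of the element currently being read
}
- (void)feedData:(NSData *)chunk;  // call for every chunk as it arrives
- (void)finish;                    // call once after the last chunk
- (void)resetText;
- (void)appendBytes:(const char *)bytes length:(int)length;
- (void)closeElement:(const char *)name;
@end

// SAX1-style callbacks: libxml2 hands the user data pointer back as ctx.
static void elementStarted(void *ctx, const xmlChar *name, const xmlChar **atts) {
    [(ChunkedParser *)ctx resetText];
}
static void charactersFound(void *ctx, const xmlChar *ch, int len) {
    [(ChunkedParser *)ctx appendBytes:(const char *)ch length:len];
}
static void elementEnded(void *ctx, const xmlChar *name) {
    [(ChunkedParser *)ctx closeElement:(const char *)name];
}

// File-scope handler table; unused callbacks stay NULL (zero-initialized).
static xmlSAXHandler saxHandler;

@implementation ChunkedParser

- (id)init {
    if ((self = [super init])) {
        saxHandler.startElement = elementStarted;
        saxHandler.characters = charactersFound;
        saxHandler.endElement = elementEnded;
        // No initial chunk: all data is supplied later through feedData:.
        context = xmlCreatePushParserCtxt(&saxHandler, self, NULL, 0, NULL);
    }
    return self;
}

- (void)feedData:(NSData *)chunk {
    // Parse only this chunk; the full document is never held by the parser.
    xmlParseChunk(context, (const char *)[chunk bytes], (int)[chunk length], 0);
}

- (void)finish {
    // terminate = 1 tells libxml2 that no more data will follow.
    xmlParseChunk(context, NULL, 0, 1);
}

- (void)resetText {
    [currentText release];
    currentText = [[NSMutableString alloc] init];
}

- (void)appendBytes:(const char *)bytes length:(int)length {
    // alloc/release instead of an autoreleased string, so nothing piles up in a pool.
    NSString *piece = [[NSString alloc] initWithBytes:bytes
                                               length:length
                                             encoding:NSUTF8StringEncoding];
    if (piece) {
        [currentText appendString:piece];
    }
    [piece release];
}

- (void)closeElement:(const char *)name {
    // Placeholder: store currentText in your own data model here.
    NSLog(@"%s = %@", name, currentText);
    [currentText release];
    currentText = nil;
}

- (void)dealloc {
    if (context) {
        xmlFreeParserCtxt(context);
    }
    [currentText release];
    [super dealloc];
}

@end
```

With NSURLConnection you would call feedData: from connection:didReceiveData: and finish from connectionDidFinishLoading:, so downloading and parsing overlap instead of running back to back.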

See my blog article for more details.

mfazekas answered Sep 28 '22 01:09
