
Huge memory consumption while parsing JSON and creating NSManagedObjects

I'm parsing a roughly 53 MB JSON file on an iPad. The parsing itself works fine; I'm using YAJLParser, a SAX-style parser, and have set it up like this:

    NSData *data = [NSData dataWithContentsOfFile:path options:NSDataReadingMappedAlways|NSDataReadingUncached error:&parseError];
    YAJLParser *parser = [[YAJLParser alloc] init];
    parser.delegate = self;
    [parser parse:data];

Everything worked fine until now, but the JSON file has grown, and I'm suddenly getting memory warnings on the iPad 2. It receives four memory warnings and then just crashes. On the iPad 3 it works flawlessly, without any memory warnings.

I started profiling with Instruments and found a lot of CFNumber allocations. (I stopped Instruments after a couple of minutes; when I previously let it run until the crash, the CFNumber allocations had reached about 60 MB or more.)

[Screenshot: CFNumber allocations in Instruments]

After opening the CFNumber detail, a huge list of allocations showed up. One of them showed me the following:

[Screenshot: CFNumber allocation detail 1]

and another one here:

[Screenshot: CFNumber allocation detail 2]

So what am I doing wrong? And what does that number (e.g. 72.8% in the last image) stand for? I'm using ARC, so I'm not sending any retain or release messages myself.

Thanks for your help. Cheers

EDIT: I have already asked the question about how to parse such huge files here: iPad - Parsing an extremely huge json - File (between 50 and 100 mb)
So the parsing itself seems to be fine.

asked Aug 27 '13 by gasparuff


2 Answers

See Apple's Core Data documentation on Efficiently Importing Data, particularly "Reducing Peak Memory Footprint".

You need to make sure you don't keep too many new entities in memory at once. That means saving and resetting your context at regular intervals while you parse the data, and wrapping the work in autorelease pools.

The general approach, sketched in Objective-C, would be something like this:

    NSUInteger importCount = 0;
    while (thereIsNewData) {
        @autoreleasepool {
            [self importNextItem];
            if (++importCount % 100 == 0) {
                NSError *saveError = nil;
                if (![context save:&saveError]) {
                    // handle the error
                }
                [context reset];
            }
        }
    }

So basically, put an autorelease pool around your main loop or parsing code. Count how many NSManagedObject instances you have created, and periodically save and reset the managed object context to flush these out of memory. This should keep your memory footprint down. The number 100 is arbitrary and you might want to experiment with different values.

Because you are saving the context for each batch, you may want to import into a temporary copy of your store in case something goes wrong and leaves you with a partial import. When everything is finished you can overwrite the original store.
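If you take the temporary-copy route, one minimal sketch looks like this (the file names and the `importSucceeded` flag are illustrative, not part of the original answer; with SQLite journaling enabled there may also be -wal/-shm sidecar files to handle):

    // Run the batched import against a context backed by a scratch store,
    // then swap the scratch file in only if everything succeeded.
    NSURL *docsURL  = [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                              inDomains:NSUserDomainMask] firstObject];
    NSURL *storeURL = [docsURL URLByAppendingPathComponent:@"Data.sqlite"];
    NSURL *tempURL  = [docsURL URLByAppendingPathComponent:@"Data-import.sqlite"];

    // ... import into a persistent store created at tempURL ...

    if (importSucceeded) {
        NSError *error = nil;
        [[NSFileManager defaultManager] removeItemAtURL:storeURL error:NULL];
        if (![[NSFileManager defaultManager] moveItemAtURL:tempURL toURL:storeURL error:&error]) {
            NSLog(@"Could not swap in imported store: %@", error);
        }
    }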

answered Nov 15 '22 by Mike Weller


Try using [self.managedObjectContext refreshObject:obj mergeChanges:NO] after a certain number of insert operations. This turns the NSManagedObjects back into faults and frees up some memory.
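A minimal sketch of that idea (the entity name "Item", the `items` array, and the batch size of 500 are illustrative assumptions, not from the original answer):

    NSUInteger batchCount = 0;
    for (NSDictionary *itemJSON in items) {
        NSManagedObject *obj =
            [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                          inManagedObjectContext:self.managedObjectContext];
        // ... populate obj from itemJSON ...

        if (++batchCount % 500 == 0) {
            NSError *error = nil;
            [self.managedObjectContext save:&error]; // save first: faulting discards unsaved changes
            NSSet *registered = [self.managedObjectContext.registeredObjects copy];
            for (NSManagedObject *saved in registered) {
                [self.managedObjectContext refreshObject:saved mergeChanges:NO]; // back to a fault
            }
        }
    }

Saving before refreshing matters here, because refreshObject:mergeChanges:NO throws away any unsaved changes on the object.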

Apple Docs on provided methods

answered Nov 15 '22 by Arthur Shinkevich