I have a CSV file that contains over 80,000 rows and 100 columns. I'm trying to handle loading/accessing the CSV data as efficiently as possible. Right now my CSVParser loads the data into an NSArray, but it's extremely slow; this is a problem because I need to do this parsing/loading on a mobile device: the iPhone.
Any suggestions for an alternate method would be much appreciated. Thank you.
UPDATE:
For future reference/discussion, I now have the following attempt:
// Mark time the parser starts
NSTimeInterval start = [NSDate timeIntervalSinceReferenceDate];
// Parse the CSV file
[parser parse];
NSTimeInterval end = [NSDate timeIntervalSinceReferenceDate];
// Print how long the parsing took
NSLog(@"raw difference: %f", (end-start));
// Copy the allLines array from the parsing delegate
NSArray *allOfTheRows = [NSArray arrayWithArray:d.allLines];
NSLog(@"There are %lu lines in the csv file", (unsigned long)[allOfTheRows count]);
NSFileManager *f = [[NSFileManager alloc] init];
NSString *filePath = @"/Users/..../rawData"; // This is of course not a literal location...
// Archive the array as NSData
NSData *someData = [NSKeyedArchiver archivedDataWithRootObject:allOfTheRows];
// Write the data to a file
[f createFileAtPath:filePath contents:someData attributes:nil];
/*
If I were to load the data from the iPhone, I'd copy the newly created someData file above to my application's mainBundle, and then unarchive the NSData to an array on the iPhone
*/
// Read the data back as an array
NSData *readData = [NSData dataWithContentsOfFile:filePath];
NSArray *bigCollectionReadBack = [NSKeyedUnarchiver unarchiveObjectWithData:readData];
I had similar problems with CSV parsing on the iPhone. I ended up doing the parsing on the Mac and writing out a binary file containing the array of struct data. Parsing/loading the CSV file used to take 120 seconds on the iPhone 4, but the binary file loads in under 10 milliseconds.
EDIT - To elaborate a bit more: on the Mac I read the CSV file, organize the data into several arrays of structs, then write out the data to a binary file using fwrite. On iOS I read the binary file using fread (one read for the header to get size info, and a second read for the data) into an array of structs of the right size. One of the larger files is 2.2 MB and it takes 66 msec to read from the flash into RAM using fread.
2011-11-15 17:32:35.304 -[BinFile initWithFile:] 001953f0 file Metro
2011-11-15 17:32:35.370 -[BinFile initWithFile:] read 2217385 bytes (Metro)
I'm not sure what you mean by "alternate method", but if you have a huge dataset, another method won't help you; what will help is optimizing your current load process.
You could do the following:
UPDATE:
You didn't say that the file remains in the device's resource folder and never changes (e.g. that it isn't downloaded from an external source). If that is the case, go with progrmr's solution.