Update 4
Per Greg's suggestion I've created one pair of image/text that shows the output from a 37k image to base64 encoded, using 100k chunks. Since the file is only 37k it's safe to say the loop only iterated once, so nothing was appended. The other pair shows the output from the same 37k image to base64 encoded, using 10k chunks. Since the file is 37k the loop iterated four times, and data was definitely appended.
Doing a diff on the two files shows that on the 10kb chunk file there's a large difference that begins on line 214 and ends on line 640.
Update 3
Here's where my code is now. Cleaned up a bit but still producing the same effect:
// Read data in chunks from the original file
[originalFile seekToEndOfFile];
NSUInteger fileLength = [originalFile offsetInFile];
[originalFile seekToFileOffset:0];
NSUInteger chunkSize = 100 * 1024;
NSUInteger offset = 0;
while(offset < fileLength) {
    NSData *chunk = [originalFile readDataOfLength:chunkSize];
    offset += chunkSize;
    // Convert the chunk to a base64 encoded string and back into NSData
    NSString *base64EncodedChunkString = [chunk base64EncodedString];
    NSData *base64EncodedChunk = [base64EncodedChunkString dataUsingEncoding:NSASCIIStringEncoding];
    // Write the encoded chunk to our output file
    [encodedFile writeData:base64EncodedChunk];
    // Cleanup
    base64EncodedChunkString = nil;
    base64EncodedChunk = nil;
    // Update progress bar
    [self updateProgress:[NSNumber numberWithInt:offset] total:[NSNumber numberWithInt:fileLength]];
}
Update 2
So it looks like files that are larger than 100 KB get scrambled, but files under 100 KB are fine. It's obvious that something is off on my buffer/math/etc, but I'm lost on this one. Might be time to call it a day, but I'd love to go to sleep with this one resolved.
Here's an example:
Update 1
After doing some testing I have found that the same code will work fine for a small image, but will not work for a large image or video of any size. Definitely looks like a buffer issue, right?
Hey there, trying to base64 encode a large file by looping through and doing it one small chunk at a time. Everything seems to work but the files always end up corrupted. I was curious if anyone could point out where I might be going wrong here:
    NSFileHandle *originalFile, *encodedFile;
    self.localEncodedURL = [NSString stringWithFormat:@"%@-base64.xml", self.localURL];
    // Open the original file for reading
    originalFile = [NSFileHandle fileHandleForReadingAtPath:self.localURL];
    if (originalFile == nil) {
        [self performSelectorOnMainThread:@selector(updateStatus:) withObject:@"Encoding failed." waitUntilDone:NO];
        return;
    }
    encodedFile = [NSFileHandle fileHandleForWritingAtPath:self.localEncodedURL];
    if (encodedFile == nil) {
        [self performSelectorOnMainThread:@selector(updateStatus:) withObject:@"Encoding failed." waitUntilDone:NO];
        return;
    }
    // Read data in chunks from the original file
    [originalFile seekToEndOfFile];
    NSUInteger length = [originalFile offsetInFile];
    [originalFile seekToFileOffset:0];
    NSUInteger chunkSize = 100 * 1024;
    NSUInteger offset = 0;
    do {
        NSUInteger thisChunkSize = length - offset > chunkSize ? chunkSize : length - offset;
        NSData *chunk = [originalFile readDataOfLength:thisChunkSize];
        offset += [chunk length];
        NSString *base64EncodedChunkString = [chunk base64EncodedString];
        NSData *base64EncodedChunk = [base64EncodedChunkString dataUsingEncoding:NSASCIIStringEncoding];
        [encodedFile writeData:base64EncodedChunk];
        base64EncodedChunkString = nil;
        base64EncodedChunk = nil;
    } while (offset < length);
                Base64 is an encoding algorithm that converts any characters, binary data, and even images or sound files into a readable string, which can be saved or transported over the network without data loss. The characters generated from Base64 encoding consist of Latin letters, digits, plus, and slash.
Convert Files to Base64 Just select your file or drag & drop it below, press the Convert to Base64 button, and you'll get a base64 string. Press a button – get base64. No ads, nonsense, or garbage. The input file can also be an mp3 or mp4.
To decode a file with contents that are base64 encoded, you simply provide the path of the file with the --decode flag. As with encoding files, the output will be a very long string of the original file. You may want to output stdout directly to a file.
Base64 is a binary to a text encoding scheme that represents binary data in an ASCII string format. It is designed to carry data stored in binary format across the network channels.
I wish I could give credit to GregInYEG, because his original point about padding was the underlying issue. With base64, each chunk has to be a multiple of 3. So this resolved the issue:
chunkSize = 3600
Once I had that, the corruption went away. But then I ran into memory leak issues, so I added the autorelease pool apprach taken from this post: http://www.cocoadev.com/index.pl?ReadAFilePieceByPiece
Final code:
// Read data in chunks from the original file
[originalFile seekToEndOfFile];
NSUInteger fileLength = [originalFile offsetInFile];
[originalFile seekToFileOffset:0];
// For base64, each chunk *MUST* be a multiple of 3
NSUInteger chunkSize = 24000;
NSUInteger offset = 0;
NSAutoreleasePool *chunkPool = [[NSAutoreleasePool alloc] init];
while(offset < fileLength) {
    // Read the next chunk from the input file
    [originalFile seekToFileOffset:offset];
    NSData *chunk = [originalFile readDataOfLength:chunkSize];
    // Update our offset
    offset += chunkSize;
    // Base64 encode the input chunk
    NSData *serializedChunk = [NSPropertyListSerialization dataFromPropertyList:chunk format:NSPropertyListXMLFormat_v1_0 errorDescription:NULL];
    NSString *serializedString =  [[NSString alloc] initWithData:serializedChunk encoding:NSASCIIStringEncoding];
    NSRange r = [serializedString rangeOfString:@"<data>"];
    serializedString = [serializedString substringFromIndex:r.location+7];
    r = [serializedString rangeOfString:@"</data>"];
    serializedString = [serializedString substringToIndex:r.location-1];
    // Write the base64 encoded chunk to our output file
    NSData *base64EncodedChunk = [serializedString dataUsingEncoding:NSASCIIStringEncoding];
    [encodedFile truncateFileAtOffset:[encodedFile seekToEndOfFile]];
    [encodedFile writeData:base64EncodedChunk];
    // Cleanup
    base64EncodedChunk = nil;
    serializedChunk = nil;
    serializedString = nil;
    chunk = nil;
    // Update the progress bar
    [self updateProgress:[NSNumber numberWithInt:offset] total:[NSNumber numberWithInt:fileLength]];
    // Drain and recreate the pool
    [chunkPool release];
    chunkPool = [[NSAutoreleasePool alloc] init];
}
[chunkPool release];
How are you converting back the base64 data to an image? Some implementations limit the maximum line length they will accept. Try inserting a line break every so many characters.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With