
Convert pdf file to text file

Hi all, I am working in Objective-C. My previous question was "How can I edit PDF files in an iOS application?" After a lot of googling I found the following approach: display the PDF in a UIWebView, extract the data using C/JavaScript, and edit it. I am still not sure about this procedure. What I have planned now is:

1) Display the PDF.

2) When the user wants to edit the PDF, convert the PDF to text and allow them to edit it.

3) When the user saves, convert the edited text back to PDF.

Is this a good way to proceed? I'm OK with step 1. Now, how do I convert PDF to text and text back to PDF?

Thanks in advance.

asked Nov 04 '22 17:11 by cancerian


1 Answer

When you load a custom document type (doc, ppt, pdf, etc.) into a UIWebView, the web view returns a nil HTML string, even when queried via JavaScript, so you can't read the text back out of it. There are a few suggestions for extracting PDF text here.

Turning the string back into a PDF is a different problem. If you want to retain the formatting of the original PDF, I'm fairly sure that isn't feasible, because NSAttributedString on iOS supports only limited formatting. The following will work for plain text, or for an NSAttributedString if you can produce one:

NSData *PDFDataFromString(NSString *str) {
    NSMutableData *data = [NSMutableData data];

    //Create an NSAttributedString for CoreText. If you find a way to translate
    //PDF into an NSAttributedString, you can skip this step and simply use an
    //NSAttributedString for this method's argument.

    NSAttributedString* string = [[[NSAttributedString alloc] initWithString:str] autorelease];

    //612 and 792 are the dimensions of the paper in points (8.5" x 11" at 72 points per inch).
    CGRect paperRect = CGRectMake(0.0, 0.0, 612, 792);

    CTFramesetterRef framesetter = CTFramesetterCreateWithAttributedString((CFAttributedStringRef) string);
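    //Constrain the width to the printable area (paper width minus 72-point left and right margins)
    //and leave the height unconstrained. CGFLOAT_MAX is used because 1e40 overflows a 32-bit CGFloat.
    CGSize requiredSize = CTFramesetterSuggestFrameSizeWithConstraints(framesetter, CFRangeMake(0, [string length]), NULL, CGSizeMake(paperRect.size.width - 144, CGFLOAT_MAX), NULL);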

    //Subtract the top and bottom margins (72 and 72), so they aren't factored in page count calculations.
    NSUInteger pageCount = (NSUInteger)ceil(requiredSize.height / (paperRect.size.height - 144));
    CFIndex resumePageIndex = 0;
    UIGraphicsBeginPDFContextToData(data, paperRect, nil);

    for(NSUInteger i = 0; i < pageCount; i++) 
    {

        //After calculating the required number of pages, break up the string and
        //draw each piece onto its own page.

        UIGraphicsBeginPDFPage();
        CGContextRef currentContext = UIGraphicsGetCurrentContext();
        CGContextSaveGState (currentContext);
        CGContextSetTextMatrix(currentContext, CGAffineTransformIdentity);
        CGMutablePathRef framePath = CGPathCreateMutable();

        //72 and 72 are the X and Y margins of the page in points.
        CGPathAddRect(framePath, NULL, CGRectInset(paperRect, 72.0, 72.0));

        CTFrameRef frameRef = CTFramesetterCreateFrame(framesetter, CFRangeMake(resumePageIndex, 0), framePath, NULL);
        resumePageIndex += CTFrameGetVisibleStringRange(frameRef).length;
        CGPathRelease(framePath);
        CGContextTranslateCTM(currentContext, 0, paperRect.size.height);
        CGContextScaleCTM(currentContext, 1.0, -1.0);
        CTFrameDraw(frameRef, currentContext);
        CFRelease(frameRef);    
        CGContextRestoreGState (currentContext);
    }
    CFRelease(framesetter);
    UIGraphicsEndPDFContext();
    return data;
}
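
The page-count arithmetic in the function above can be checked in isolation. Below is a minimal sketch in plain C (which also compiles unchanged in an Objective-C file). The function name pdf_page_count and its parameters are hypothetical and not part of the code above; it assumes the same margin applies to the top and bottom of the page.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original answer.
 * Estimates how many pages are needed for text of a given laid-out height.
 *   text_height  - total height of the laid-out text, in points
 *   paper_height - height of one page, in points (792 for US Letter)
 *   margin       - top margin, assumed equal to the bottom margin, in points
 * Returns 0 when there is nothing to draw or no printable area is left. */
size_t pdf_page_count(double text_height, double paper_height, double margin) {
    double printable = paper_height - 2.0 * margin;
    if (text_height <= 0.0 || printable <= 0.0) {
        return 0;
    }
    /* Round up without libm: truncate, then add a page for any remainder. */
    size_t pages = (size_t)(text_height / printable);
    if ((double)pages * printable < text_height) {
        pages++;
    }
    return pages;
}
```

With a 792-point page and 72-point margins the printable height is 648 points, so text 648 points tall fits on one page and text 649 points tall needs two. Note that this is only an estimate: Core Text fills each page with whole lines, so a page may hold slightly less than its printable height, and long documents can need more pages than the estimate suggests. Looping until every character has been drawn, rather than for a fixed page count, would likely be more robust.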
answered Nov 12 '22 12:11 by Isabel