I've spent all day trying to get hyperlink metadata out of PDFs in my iPad application. The CGPDF* APIs are a true nightmare, and the only piece of information I've found on the net about this is that I have to look for an "Annots" dictionary, but I just can't find it in my PDFs.
I even used the old Voyeur Xcode sample to inspect my test PDF file, but no trace of this "Annots" dictionary...
This is a feature I see in every PDF reader, and the same question has been asked multiple times here with no real practical answers. I usually never ask for sample code directly, but apparently this time I really need it... has anyone got this working, possibly with sample code?
Update: I just realized that the person who made my test PDF had only inserted a URL as plain text, not as a real annotation. He tried adding an annotation and my code works now... But that's not what I need, so it seems I'll have to analyze the text and search for URLs. But that's another story...
Update 2: So I finally came up with some working code. I'm posting it here so hopefully it'll help someone. It assumes the PDF document actually contains annotations.
    for (int i = 0; i < pageCount; i++) {
        // PDF page numbers are 1-based.
        CGPDFPageRef page = CGPDFDocumentGetPage(doc, i + 1);
        CGPDFDictionaryRef pageDictionary = CGPDFPageGetDictionary(page);

        // A page without annotations has no "Annots" entry: skip it
        // instead of returning, so later pages are still processed.
        CGPDFArrayRef outputArray;
        if (!CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray)) {
            continue;
        }

        size_t arrayCount = CGPDFArrayGetCount(outputArray);
        for (size_t j = 0; j < arrayCount; ++j) {
            CGPDFObjectRef aDictObj;
            if (!CGPDFArrayGetObject(outputArray, j, &aDictObj)) {
                continue;
            }
            CGPDFDictionaryRef annotDict;
            if (!CGPDFObjectGetValue(aDictObj, kCGPDFObjectTypeDictionary, &annotDict)) {
                continue;
            }

            // Only annotations with a URI action ("A" -> "URI") are of interest.
            CGPDFDictionaryRef aDict;
            if (!CGPDFDictionaryGetDictionary(annotDict, "A", &aDict)) {
                continue;
            }
            CGPDFStringRef uriStringRef;
            if (!CGPDFDictionaryGetString(aDict, "URI", &uriStringRef)) {
                continue;
            }

            // "Rect" holds the four coordinates of the annotation.
            CGPDFArrayRef rectArray;
            if (!CGPDFDictionaryGetArray(annotDict, "Rect", &rectArray)) {
                continue;
            }
            CGPDFReal coords[4];
            bool rectOK = (CGPDFArrayGetCount(rectArray) == 4);
            for (size_t k = 0; rectOK && k < 4; ++k) {
                rectOK = CGPDFArrayGetNumber(rectArray, k, &coords[k]);
            }
            if (!rectOK) {
                continue;
            }

            const char *uriString = (const char *)CGPDFStringGetBytePtr(uriStringRef);
            NSString *uri = [NSString stringWithCString:uriString encoding:NSUTF8StringEncoding];
            CGRect rect = CGRectMake(coords[0], coords[1], coords[2], coords[3]);

            CGPDFInteger pageRotate = 0;
            CGPDFDictionaryGetInteger(pageDictionary, "Rotate", &pageRotate);
            CGRect pageRect = CGRectIntegral(CGPDFPageGetBoxRect(page, kCGPDFMediaBox));
            if (pageRotate == 90 || pageRotate == 270) {
                CGFloat temp = pageRect.size.width;
                pageRect.size.width = pageRect.size.height;
                pageRect.size.height = temp;
            }

            // Rect stores two corners; turn the second corner into a size.
            rect.size.width -= rect.origin.x;
            rect.size.height -= rect.origin.y;

            // Flip the y-axis: PDF origin is bottom-left, UIKit's is top-left.
            CGAffineTransform trans = CGAffineTransformIdentity;
            trans = CGAffineTransformTranslate(trans, 0, pageRect.size.height);
            trans = CGAffineTransformScale(trans, 1.0, -1.0);
            rect = CGRectApplyAffineTransform(rect, trans);

            // do whatever you need with the coordinates.
            // e.g. you could create a button and put it on top of your page
            // and use it to open the URL with UIApplication's openURL
        }
    }
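The coordinate handling at the end of the loop is the easiest part to get wrong, so here is a minimal sketch of just that step in plain C, with no CoreGraphics dependency. `ViewRect` and `pdf_rect_to_view_rect` are hypothetical names made up for illustration. The sketch assumes the Rect array is ordered [llx, lly, urx, ury] and, like the code above, it handles Rotate only by swapping the page's sides, which may not be enough for every rotated PDF.

```c
/* Hypothetical plain-C stand-in for CGRect, so the sketch compiles
   without CoreGraphics. */
typedef struct {
    double x, y, width, height;
} ViewRect;

/* Convert a PDF Rect array [llx, lly, urx, ury] (origin at bottom-left)
   into a rectangle with a top-left origin.
   mediaWidth/mediaHeight: size of the page's media box, in points.
   rotate: the page's Rotate value, in degrees. */
ViewRect pdf_rect_to_view_rect(const double coords[4],
                               double mediaWidth, double mediaHeight,
                               int rotate)
{
    /* A 90 or 270 degree rotation swaps the page's width and height. */
    double pageHeight = (rotate == 90 || rotate == 270) ? mediaWidth
                                                        : mediaHeight;

    ViewRect r;
    r.x = coords[0];
    /* The array stores two corners, so the size is their difference. */
    r.width = coords[2] - coords[0];
    r.height = coords[3] - coords[1];
    /* Flip the y-axis: the upper edge (ury) is measured from the page top. */
    r.y = pageHeight - coords[3];
    return r;
}
```

For an unrotated US Letter page (612 x 792 points), the Rect {100, 700, 200, 720} comes out with origin (100, 72) and size 100 x 20. If the PDF page is drawn scaled in your view, the result still has to be multiplied by that scale factor.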
Here's the basic idea for getting to the Annots array of each page, at least. After that, you should be able to figure out the rest with help from Adobe's PDF specification.
1.) Get the CGPDFDocumentRef.
2.) Get each page.
3.) On each page, call CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray), where pageDictionary is the CGPDFDictionaryRef representing the CGPDFPage, and outputArray is the variable (a CGPDFArrayRef) that receives that page's Annots array.