
Finding hyperlinks inside a PDF document?

Tags: c#, asp.net, pdf

I'm currently using Aspose PDF Kit to split a 'master PDF' up into individual documents + thumbnails. This works well at the moment, but the device I'll be rendering the PDF on won't know about the annotations/links within the PDF.

I understand there is a way to parse the PDF document to detect the X/Y position of each hyperlink. Is there a simple way to extract or iterate over this data so I can write it to an external XML file?

Jamie Chapman asked Feb 07 '11 14:02


1 Answer

You may want to try Docotic.Pdf library for this (disclaimer: I work for Bit Miracle).

The library can be used to retrieve all hyperlinks in a document. You can also retrieve the bounding box, text, and other properties of each link.

Please take a look at the "Extract text from link target" sample. It may help you get started.
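Whichever library you use to read the links, the XML export step is independent of it. Below is a minimal sketch of that step only, using System.Xml.Linq from the .NET base class library. The `LinkInfo` class, the `LinkXmlWriter` helper, and the element and attribute names are hypothetical and not part of any PDF library; you would fill `LinkInfo` from whatever bounding box and target your library reports for each link. Keep in mind that PDF coordinates are normally expressed in points with the origin at the bottom-left of the page, so the rendering device may need to flip the Y axis.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical container for one extracted link (not a library type).
public class LinkInfo
{
    public int Page { get; set; }        // 1-based page number
    public double X { get; set; }        // left edge of the bounding box
    public double Y { get; set; }        // bottom (or top) edge, depending on your library
    public double Width { get; set; }
    public double Height { get; set; }
    public string Target { get; set; }   // e.g. the link's URI
}

public static class LinkXmlWriter
{
    // Format numbers with '.' as the decimal separator regardless of server locale.
    private static string Num(double value)
    {
        return value.ToString("0.###", CultureInfo.InvariantCulture);
    }

    // Builds <links><link page="" x="" y="" width="" height="" target="" /></links>.
    public static XDocument ToXml(IEnumerable<LinkInfo> links)
    {
        return new XDocument(
            new XElement("links",
                links.Select(l => new XElement("link",
                    new XAttribute("page", l.Page),
                    new XAttribute("x", Num(l.X)),
                    new XAttribute("y", Num(l.Y)),
                    new XAttribute("width", Num(l.Width)),
                    new XAttribute("height", Num(l.Height)),
                    new XAttribute("target", l.Target ?? string.Empty)))));
    }
}
```

Once the list is populated, `LinkXmlWriter.ToXml(links).Save("links.xml");` writes the file. Attributes are used here only to keep each link on one line; child elements would work just as well if your consumer prefers them.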

Bobrovsky answered Oct 06 '22 00:10