
How to search a PDF in Acrobat Reader AND jump to a certain page via parameter?

We are using Lucene within a web application to search a large number of PDF documents.

The workflow is like this:

  1. A user enters a search term

  2. A list of search results is presented to the user.

  3. Each search result represents one PDF document and shows the user on which page the search term was found. Each of these pages is represented as a hyperlink.

  4. When the user clicks one of these hyperlinks, the PDF opens directly at that page.

  5. But the search term isn't highlighted on the page, so the user has to scan the page to find it.

What we want is a way to highlight the search term on the specific page in the PDF.

The open parameters for Acrobat Reader allow for either searching a PDF document (with hit highlighting) OR jumping to a specific page. But the combination of both parameters - which we would need - doesn't work.
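For reference, both open parameters are appended to the PDF URL as a fragment. Below is a minimal Python sketch of building each kind of link, assuming the `page` and `search` open parameters behave as documented by Adobe; the helper names and the example URL are hypothetical, and percent-encoding the search term is an assumption about what the viewer accepts.

```python
from urllib.parse import quote


def page_link(pdf_url: str, page: int) -> str:
    """Return a link that opens the PDF at the given page (first page is 1)."""
    return f"{pdf_url}#page={page}"


def search_link(pdf_url: str, term: str) -> str:
    """Return a link that opens the PDF and searches for the given term."""
    # The word list is wrapped in double quotes; the term itself is
    # percent-encoded so that spaces do not break the URL (assumption).
    return f'{pdf_url}#search="{quote(term)}"'


# Hypothetical document URL, for illustration only
print(page_link("http://example.com/docs/manual.pdf", 15))
print(search_link("http://example.com/docs/manual.pdf", "lucene index"))
```

Each link works on its own; as described above, putting both parameters in one fragment does not give the combined behaviour.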

Does anyone have an idea how jumping to a page and highlighting a search term in a PDF document could work? I had a look at the Acrobat SDK but don't see how we can use it (it's terribly documented).

agez asked Nov 05 '22 13:11


1 Answer

Acrobat uses a plugin to highlight terms, and it requires an FDF stream that indicates the words to highlight. See here for pointers:

support.dtsearch.com/dts0152.htm

Update:

Assuming you know the page number and the position on that page of the word to highlight, here is one way to do it:

On the web page:

<iframe id="acroframe" src="pdfpage/example.pdf#xml=http://example.com/hilite.aspx?hilite=8e3302ee-ff88-41ee-bdfb-9e8df87cc3ad&toolbar=1&navpanes=0&statusbar=0&view=FitH">
</iframe>

The PDF appears in the frame with the toolbar shown, the navigation pane and status bar hidden, and the page fitted horizontally to the frame. Acrobat then queries the web site to get the xfdf data for highlighting: http://example.com/hilite.aspx?hilite=8e3302ee-ff88-41ee-bdfb-9e8df87cc3ad
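If the search results page is generated server-side, the iframe src can be assembled per hit. Here is a rough Python sketch that reproduces the src shown above; the function name `viewer_src` is hypothetical, and the parameter order simply mirrors the example (highlight URL first, then the view options).

```python
def viewer_src(pdf_path: str, highlight_url: str, **view_params: str) -> str:
    """Build an iframe src: PDF path, '#xml=<highlight URL>', then view options."""
    parts = [f"xml={highlight_url}"]
    # Keyword arguments keep their call order, so the options appear
    # in the same order as they were passed in.
    parts += [f"{name}={value}" for name, value in view_params.items()]
    return f"{pdf_path}#" + "&".join(parts)


src = viewer_src(
    "pdfpage/example.pdf",
    "http://example.com/hilite.aspx?hilite=8e3302ee-ff88-41ee-bdfb-9e8df87cc3ad",
    toolbar="1",
    navpanes="0",
    statusbar="0",
    view="FitH",
)
print(src)
```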

Here I used a GUID key under which I had previously saved the highlight xfdf value in the session. The hilite.aspx page returns something like the following to highlight words in the document:

<XML>
<Body units=characters color=#ff00ff mode=active version=2>
<Highlight>
<loc pg=15 pos=3583 len=5>
</Highlight>
</Body>
</XML>

This highlights 5 characters on page 15, starting at position 3583. (Note: xfdf is not real "XML" despite the similarity.)

Note that Acrobat Reader must have the "Enable search highlights from external highlight server" option checked in its preferences.
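The response body above can be generated from a list of hits and stored under a GUID until the viewer requests it. The following is only a sketch in Python rather than ASP.NET: the `Hit` type, the function names, and the in-memory dictionary standing in for the session are all hypothetical, and the output format simply copies the sample above.

```python
import uuid
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    page: int    # value written to the pg attribute
    pos: int     # character offset of the hit on that page
    length: int  # number of characters to highlight


def highlight_body(hits: List[Hit], color: str = "#ff00ff") -> str:
    """Render hits in the same unquoted, XML-like layout as the sample."""
    lines = [
        "<XML>",
        f"<Body units=characters color={color} mode=active version=2>",
        "<Highlight>",
    ]
    lines += [f"<loc pg={h.page} pos={h.pos} len={h.length}>" for h in hits]
    lines += ["</Highlight>", "</Body>", "</XML>"]
    return "\n".join(lines)


# Hypothetical stand-in for the web session: GUID key -> highlight body
_highlight_store: Dict[str, str] = {}


def save_highlights(hits: List[Hit]) -> str:
    """Store the rendered highlight data and return the GUID key for the URL."""
    key = str(uuid.uuid4())
    _highlight_store[key] = highlight_body(hits)
    return key


def load_highlights(key: str) -> Optional[str]:
    """Return the stored body for a key, or None if the key is unknown."""
    return _highlight_store.get(key)


key = save_highlights([Hit(page=15, pos=3583, length=5)])
print(load_highlights(key))
```

The key returned by `save_highlights` is what would go into the `hilite=` query parameter of the highlight URL.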

mosheb answered Nov 15 '22 05:11