I have a series of images (PNGs) and, for each one, an unformatted text version of its content. I'd like to make a PDF where each image becomes a full page of the resulting PDF, with the corresponding text also attached to the page, so that searching for a word brings you to the pages containing it, even though the text is never directly displayed.
This is a one-shot job, so it doesn't have to be neat or scalable. I could use any language commonly available on a Linux system, or common command-line tools. (I also have a Windows system with Acrobat available, though there are nearly a thousand images, so anything manual wouldn't work.)
One option would be to generate the PDF using Java and Apache FOP, but that might be more work than you're looking to do.
You might do better with iText; see the example of adding a PNG to an iText document to generate a PDF.
You will need to determine how to generate a layer in which to place your searchable text; I am unable to advise you on how to do this step.
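For the hidden-text step, one approach that does not need a layer at all is PDF text rendering mode 3 (`3 Tr`), which draws text with neither fill nor stroke, so it is invisible but still found by search. Below is a rough sketch in plain Python (standard library only, not iText or FOP) that writes such a PDF by hand. The names `read_png`, `pdf_string`, and `build_pdf` are made up for this example. It assumes 8-bit, non-interlaced grayscale or RGB PNGs without alpha or palette, maps one pixel to one point, and only handles text that fits in cp1252; anything else would need conversion first.

```python
import struct
import zlib  # not used for decoding here; the PNG's own zlib data is embedded as-is


def read_png(data):
    """Return (width, height, colors, idat) for a simple 8-bit gray/RGB PNG."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    pos, idat, header = 8, b"", None
    while pos < len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            header = struct.unpack(">IIBBBBB", body)
        elif kind == b"IDAT":
            idat += body
        pos += 12 + length  # length field + type + body + CRC
    width, height, depth, color_type, _, _, interlace = header
    if depth != 8 or color_type not in (0, 2) or interlace:
        raise ValueError("sketch only handles 8-bit non-interlaced gray/RGB")
    return width, height, (1 if color_type == 0 else 3), idat


def pdf_string(text):
    """Escape text as a PDF literal string; non-cp1252 characters become '?'."""
    raw = text.encode("cp1252", "replace")
    for ch in (b"\\", b"(", b")"):
        raw = raw.replace(ch, b"\\" + ch)
    return b"(" + raw + b")"


def build_pdf(pages):
    """pages: list of (png_bytes, text) pairs. Returns the PDF as bytes."""
    objects = []  # PDF object number n is objects[n - 1]

    def add(body):
        objects.append(body)
        return len(objects)

    def stream(entries, payload):
        return b"<< %s /Length %d >>\nstream\n%s\nendstream" % (
            entries, len(payload), payload)

    catalog, tree = add(b""), add(b"")  # placeholders, filled in below
    font = add(b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica"
               b" /Encoding /WinAnsiEncoding >>")
    kids = []
    for png, text in pages:
        w, h, colors, idat = read_png(png)
        space = b"/DeviceGray" if colors == 1 else b"/DeviceRGB"
        # PNG scanline filters match PDF's /Predictor 15, so IDAT goes in unchanged.
        img = add(stream(
            (b"/Type /XObject /Subtype /Image /Width %d /Height %d"
             b" /ColorSpace %s /BitsPerComponent 8 /Filter /FlateDecode"
             b" /DecodeParms << /Predictor 15 /Colors %d"
             b" /BitsPerComponent 8 /Columns %d >>") % (w, h, space, colors, w),
            idat))
        lines = [pdf_string(line) + b" Tj T*" for line in text.splitlines()]
        # Draw the image over the whole page, then the text in invisible mode 3.
        content = add(stream(b"", (
            b"q %d 0 0 %d 0 0 cm /Im0 Do Q\n"
            b"BT /F1 10 Tf 3 Tr 12 TL 10 %d Td\n%s\nET")
            % (w, h, h - 12, b"\n".join(lines))))
        kids.append(add(
            (b"<< /Type /Page /Parent %d 0 R /MediaBox [0 0 %d %d]"
             b" /Resources << /XObject << /Im0 %d 0 R >>"
             b" /Font << /F1 %d 0 R >> >> /Contents %d 0 R >>")
            % (tree, w, h, img, font, content)))
    objects[catalog - 1] = b"<< /Type /Catalog /Pages %d 0 R >>" % tree
    objects[tree - 1] = b"<< /Type /Pages /Count %d /Kids [%s] >>" % (
        len(kids), b" ".join(b"%d 0 R" % k for k in kids))

    out, offsets = bytearray(b"%PDF-1.4\n"), []
    for number, body in enumerate(objects, 1):
        offsets.append(len(out))
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root %d 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1, catalog, xref)
    return bytes(out)
```

To use it, read each PNG and its text file into a list of `(png_bytes, text)` pairs, call `build_pdf(pairs)`, and write the returned bytes to a `.pdf` file in binary mode. The hidden text is laid out from the top-left corner in 10-point lines, so a search hit will land on the right page but the highlight will not line up with the words in the image.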
Here is how you can tell if a PDF contains text, which might help you with building one.