Does anyone know of a PDF file parser that I could use to pull sections of text out of a plaintext PDF file? Specifically, I want a way to reliably pull out the text that belongs to annotations.
Delphi, C#, or regex; I don't mind which.
The PDF File Parser article on xactpro seems to be exactly what you need. It explains the format of the PDF and comes with full source code for a parser (and another project for visualisation of the model).
The parser uses format-specific terms, but you could easily use the visualiser to learn what to look for.
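If you only need a quick regex pass rather than a full parser, here is a minimal sketch in Python. It is not the xactpro parser, and it rests on several assumptions: the PDF is uncompressed (no object streams or Flate-encoded content), each annotation dictionary carries the optional `/Type /Annot` entry, and `/Contents` is a literal `(...)` string without nested unescaped parentheses. The function name `find_annotations` and the returned dictionary keys are made up for illustration.

```python
import re

# One indirect object: "<num> <gen> obj ... endobj"
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# Annotation dictionaries usually (but not always) carry /Type /Annot
ANNOT_RE = re.compile(rb"/Type\s*/Annot\b")
# Annotation kind, e.g. /Subtype /Text or /Subtype /Highlight
SUBTYPE_RE = re.compile(rb"/Subtype\s*/(\w+)")
# Literal string after /Contents; allows backslash escapes such as \( and \)
CONTENTS_RE = re.compile(rb"/Contents\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)


def find_annotations(data: bytes) -> list:
    """Return one dict per annotation object found in raw, uncompressed PDF bytes."""
    results = []
    for match in OBJ_RE.finditer(data):
        body = match.group(3)
        if not ANNOT_RE.search(body):
            continue  # not an annotation object (assumes /Type /Annot is present)
        subtype = SUBTYPE_RE.search(body)
        contents = CONTENTS_RE.search(body)
        results.append({
            "object": int(match.group(1)),
            "subtype": subtype.group(1).decode("ascii") if subtype else None,
            # Raw string body; PDF escape sequences are left undecoded
            "contents": contents.group(1).decode("latin-1") if contents else None,
        })
    return results
```

To try it, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `find_annotations`. If it returns nothing on a file that you know has annotations, the objects are probably compressed, and a real parser is the better route.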