I need to split a large PDF document into smaller files based on the file's content. We use BCL easyPDF to manipulate PDF files. easyPDF can split a PDF document at a given page number, but it cannot split it based on content. As far as I can tell, it also has no search function for determining where the content is located (if I am wrong, please let me know).
Can someone tell me how to find the location of text in a PDF file using .NET?
Thanks
A PDF parser (sometimes also called a PDF scraper) is software that extracts data from PDF documents. PDF parsers come as libraries for developers or as standalone products for end users, and they are mainly used to extract data from batches of PDF files.
You might try the Docotic.Pdf library for this task.
The library can extract text from PDFs (with or without formatting).
Alternatively, you can retrieve a collection of words with their bounding rectangles from each page. That should help you find the location of the text in a file.
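Once you have words with bounding rectangles, the remaining work is independent of any particular library. Here is a minimal sketch of that step, written in Python only to keep it short; the same logic ports directly to C#. The `Word` record, `find_phrase`, and `split_ranges` are hypothetical names made up for illustration, not part of Docotic.Pdf or easyPDF, and the sketch assumes the words arrive in reading order with a zero-based page index.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One extracted word and its bounding rectangle (hypothetical shape)."""
    text: str
    page: int      # zero-based page index
    x: float
    y: float
    width: float
    height: float


def find_phrase(words: List[Word], phrase: str) -> List[Word]:
    """Return the first word of every occurrence of `phrase`.

    Matching is case-insensitive, word by word, and never spans two pages.
    """
    targets = phrase.lower().split()
    hits: List[Word] = []
    if not targets:
        return hits
    for i in range(len(words) - len(targets) + 1):
        window = words[i:i + len(targets)]
        same_page = all(w.page == window[0].page for w in window)
        if same_page and [w.text.lower() for w in window] == targets:
            hits.append(window[0])
    return hits


def split_ranges(hits: List[Word], page_count: int) -> List[Tuple[int, int]]:
    """Turn marker hits into inclusive (first_page, last_page) ranges."""
    starts = sorted({h.page for h in hits})
    if not starts or starts[0] != 0:
        # Pages before the first marker form their own chunk.
        starts.insert(0, 0)
    ends = [s - 1 for s in starts[1:]] + [page_count - 1]
    return list(zip(starts, ends))
```

Each hit carries the page and rectangle of the marker text, so `split_ranges` yields page ranges that you could hand to a page-based splitter such as the one in easyPDF. If your splitter expects one-based page numbers, add 1 to both ends of each range.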
Disclaimer: I work for the vendor of the library.