I have requirement to read a pdf file and search for a text. I should display in which page that text exist and the number of occurances. I can read the pdf to text but i need to know the page number.
Thanks
You can use Docotic.Pdf for this (I work for Bit Miracle).
Here is a sample for how to search text in PDF:
PdfDocument doc = new PdfDocument("file.pdf");
string textToSearch = "some text";
for (int i = 0; i < doc.Pages.Count; i++)
{
string pageText = doc.Pages[i].GetText();
int count = 0;
int lastStartIndex = pageText.IndexOf(textToSearch, 0, StringComparison.CurrentCultureIgnoreCase);
while (lastStartIndex != -1)
{
count++;
lastStartIndex = pageText.IndexOf(textToSearch, lastStartIndex + 1, StringComparison.CurrentCultureIgnoreCase);
}
if (count != 0)
Console.WriteLine("Page {0}: '{1}' found {2} times", i, textToSearch, count);
}
You may want to remove third argument for IndexOf
method if you want to perform case-sensitive search.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With