I am trying to get the contents of a PDF annotation into a string so I can store that information in a database for searching purposes.
Does anyone know how to accomplish this using iText/iTextSharp?
Yes, but the specifics really depend on what kind[s] of annotations you're talking about.
In general:
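    // Page numbers in iText start at 1, not 0.
    PdfDictionary pageDict = myPdfReader.getPageN(firstPageIsOne);
    PdfArray annotArray = pageDict.getAsArray(PdfName.ANNOTS);
    // getAsArray returns null when the page has no /Annots entry.
    if (annotArray != null) {
        for (int i = 0; i < annotArray.size(); ++i) {
            PdfDictionary curAnnot = annotArray.getAsDict(i);
            // myCodeToGetAnAnnotsType, THIS_TYPE and THAT_TYPE are placeholders for your own code.
            int someType = myCodeToGetAnAnnotsType(curAnnot);
            if (someType == THIS_TYPE) {
                writeThisType(curAnnot);
            } else if (someType == THAT_TYPE) {
                writeThatType(curAnnot);
            }
        }
    }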
For details, you'll need to examine the PDF Specification (ISO 32000-1), in particular the annotation descriptions in section 12.5.6, "Annotation Types".
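As a rough illustration of what the type check in the loop above might look like: each annotation dictionary carries a /Subtype name (for example Text for sticky notes, FreeText for text drawn on the page), and the text worth indexing usually sits in /T and /Contents. The sketch below uses a plain Map as a stand-in for the annotation dictionary so it runs without iText; the class name, the Kind enum and both methods are hypothetical. With iText you would presumably read the subtype from the dictionary itself, e.g. via curAnnot.getAsName(PdfName.SUBTYPE).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AnnotationSketch {
    // Hypothetical kinds, standing in for THIS_TYPE / THAT_TYPE.
    enum Kind { STICKY_NOTE, FREE_TEXT, OTHER }

    // Maps the value of an annotation's /Subtype entry to a kind.
    static Kind kindOf(String subtype) {
        if ("Text".equals(subtype)) return Kind.STICKY_NOTE;   // sticky note
        if ("FreeText".equals(subtype)) return Kind.FREE_TEXT; // text drawn on the page
        return Kind.OTHER;
    }

    // Builds one searchable line from the kind plus any non-blank /T and /Contents values.
    static String toSearchText(Map<String, String> annot) {
        StringBuilder sb = new StringBuilder(kindOf(annot.get("Subtype")).name());
        for (String key : new String[] {"T", "Contents"}) {
            String value = annot.get(key);
            if (value != null && !value.trim().isEmpty()) {
                sb.append(' ').append(value.trim());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Map stands in for the annotation dictionary (assumption for this sketch).
        Map<String, String> note = new LinkedHashMap<>();
        note.put("Subtype", "Text");
        note.put("T", "Mark");
        note.put("Contents", "Check this figure");
        System.out.println(toSearchText(note));
    }
}
```

The string returned by toSearchText is what you would store in the database column you search against; which entries matter for other subtypes depends on the annotation type in question.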
If you can tell us what types you care about, I can be of more help.
For future reference, for anyone who finds this question via Google like I did...
If what you want is the name and contents of sticky note annotations, you can do this (based in part on Mark's answer):
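    PdfReader reader = new PdfReader(somePDF);
    PdfDictionary pageDict = reader.GetPageN(1); // pages are numbered from 1
    PdfArray annotArray = pageDict.GetAsArray(PdfName.ANNOTS);
    // GetAsArray returns null when the page has no /Annots entry.
    if (annotArray != null)
    {
        for (int i = 0; i < annotArray.Size; ++i)
        {
            PdfDictionary curAnnot = annotArray.GetAsDict(i);
            // /T holds the annotation's title (usually the author); /Contents holds the note text.
            PdfString name = curAnnot.GetAsString(PdfName.T);
            PdfString contents = curAnnot.GetAsString(PdfName.CONTENTS);
            // ToUnicodeString decodes the PDF string, including Unicode-encoded text.
            if (!string.IsNullOrWhiteSpace(name?.ToUnicodeString()))
            { Console.WriteLine(name.ToUnicodeString()); }
            if (!string.IsNullOrWhiteSpace(contents?.ToUnicodeString()))
            { Console.WriteLine(contents.ToUnicodeString()); }
        }
    }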
Additionally, to help identify what you might be looking for, you can open a PDF in a text editor and search for /Annot (PDF names are case-sensitive); you'll quickly find your annotation object.
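If you'd rather not scroll through a large file by hand, the same check can be sketched in a few lines of Java. This is only a quick sanity check, not a parser: it counts raw occurrences of /Annot in the file's bytes (which also matches /Annots), and it will miss annotations stored inside compressed object streams. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class AnnotScan {
    // Counts how often the token appears in the raw bytes of a file.
    // ISO-8859-1 maps every byte to one char, so binary data does not break decoding.
    static int countOccurrences(byte[] data, String token) {
        String text = new String(data, StandardCharsets.ISO_8859_1);
        int count = 0;
        for (int i = text.indexOf(token); i >= 0; i = text.indexOf(token, i + 1)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java AnnotScan <file.pdf>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("/Annot occurrences: " + countOccurrences(data, "/Annot"));
    }
}
```

A count of zero does not prove the PDF has no annotations; it may just mean the objects are compressed, in which case reading them through iText as shown above is the more reliable route.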