Is it possible to use iTextSharp to remove objects that are not visible (or at least not being displayed) from a PDF document?
More details:
1) My source is a PDF page containing images and text (maybe some vector drawings) and embedded fonts.
2) There's an interface for drawing multiple 'crop boxes'.
3) I must generate a new PDF that contains only what is inside the crop boxes. Anything else must be removed from the resulting document (I may accept content that is half inside and half outside a crop box, but this is not ideal, and the outside part should not be displayed anyway).
My solution so far:
I have successfully developed a solution that creates new temporary documents, one per crop box, each containing the content of that crop box (using writer.GetImportedPage and contentByte.AddTemplate on a page that is exactly the size of the crop box). Then I create the final document and repeat the process, using the AddTemplate method to position each "cropped page" on the final page.
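For readers following along, the first step described above might look roughly like the sketch below. It assumes iTextSharp 5.x; the class name CropBoxExport, the method ExportCrop and its parameters are made up for illustration and are not the asker's actual code.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class CropBoxExport
{
    // Hypothetical helper: writes a one-page PDF whose page is exactly the
    // size of `crop` and shows the part of the source page lying inside it.
    public static void ExportCrop(string sourcePath, int pageNumber,
                                  Rectangle crop, string outputPath)
    {
        PdfReader reader = new PdfReader(sourcePath);
        // The new page is as large as the crop box.
        Document document = new Document(new Rectangle(crop.Width, crop.Height));
        using (FileStream fs = new FileStream(outputPath, FileMode.Create))
        {
            PdfWriter writer = PdfWriter.GetInstance(document, fs);
            document.Open();
            PdfImportedPage page = writer.GetImportedPage(reader, pageNumber);
            // Shift the imported page so that the lower-left corner of the
            // crop box lands on the origin of the new page.
            writer.DirectContent.AddTemplate(page, -crop.Left, -crop.Bottom);
            document.Close();
        }
        reader.Close();
    }
}
```

Note that the imported page is embedded whole; the smaller page size only hides what falls outside it, so this step by itself does not remove any content from the file.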
This solution has 2 big disadvantages:
So I think I need to iterate through the PDF objects, detect whether each one is visible, and delete those that are not. At the time of writing, I am trying to use pdfReader.GetPdfObject.
Thanks for the help.
If the PDF you are working with is a fixed, predefined template and the object to remove is a form field, you can remove it by calling RemoveField:
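// Open the template and a stamper that writes the modified copy to outputPath
PdfReader pdfReader = new PdfReader("../Template_Path.pdf");
PdfStamper pdfStamperToPopulate = new PdfStamper(pdfReader, new FileStream(outputPath, FileMode.Create));
AcroFields pdfFormFields = pdfStamperToPopulate.AcroFields;
// Remove the form field by name
pdfFormFields.RemoveField("fieldNameToBeRemoved");
// Closing the stamper writes the output file
pdfStamperToPopulate.Close();
pdfReader.Close();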
Yes, it's possible. You need to parse the page's content bytes into PdfObjects, keep them in memory, delete the unwanted PdfObjects, build the content bytes back from the remaining PdfObjects, and replace the page content in the PdfReader just before you import the page via PdfWriter.
I would recommend checking out this: http://habjan.blogspot.com/2013/09/proof-of-concept-converting-pdf-files.html
The sample from the link implements parsing the PDF content bytes, building them back from PdfObjects, and replacing the PdfReader page content bytes.
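As a rough illustration of that parse, filter and rebuild loop (not the code from the linked post), here is a minimal sketch assuming iTextSharp 5.5.x. The names ContentStreamFilter, FilterPage, DropImage and the keep predicate are made up for this example. Deciding whether an operation is actually visible is left to the caller; that part would require tracking the graphics state and is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.io;
using iTextSharp.text.pdf;

public static class ContentStreamFilter
{
    // Rewrites the content of one page, keeping only the operations for
    // which keep(operands) returns true. Each parsed list holds the operands
    // followed by the operator as its last element.
    public static void FilterPage(PdfReader reader, int pageNumber,
                                  Func<List<PdfObject>, bool> keep)
    {
        byte[] content = reader.GetPageContent(pageNumber);
        var source = new RandomAccessSourceFactory().CreateSource(content);
        var parser = new PdfContentParser(
            new PRTokeniser(new RandomAccessFileOrArray(source)));

        using (var output = new MemoryStream())
        {
            var operands = new List<PdfObject>();
            // Parse refills the list on each call; an empty list means end of stream.
            while (parser.Parse(operands).Count > 0)
            {
                if (!keep(operands)) continue;   // drop the unwanted operation
                foreach (PdfObject operand in operands)
                {
                    operand.ToPdf(null, output); // serialize back to PDF syntax
                    output.WriteByte((byte)' ');
                }
                output.WriteByte((byte)'\n');
            }
            // Replace the page content before the page is imported or stamped.
            reader.SetPageContent(pageNumber, output.ToArray());
        }
    }

    // Example predicate use: drop every "Do" (paint XObject) operation that
    // refers to the given resource name, e.g. "/Im1" (a hypothetical name).
    public static void DropImage(PdfReader reader, int pageNumber, string name)
    {
        FilterPage(reader, pageNumber, ops =>
        {
            string op = ops[ops.Count - 1].ToString();
            return !(op == "Do" && ops.Count == 2 && ops[0].ToString() == name);
        });
    }
}
```

This sketch does not handle inline images (BI ... ID ... EI), and removing a "Do" operation only stops the image from being painted; the image may still be referenced from the page's resource dictionary, so the file does not necessarily get smaller.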