Remove some images and text objects from a PDF

Question

I have a PDF page object with an image and a lot of text.

I want to remove that image, and also remove some text objects based on their contents. That is, I want to read the contents of every text object and then remove the ones that satisfy a condition.

How can I do that with PyPDF2? Or is there another library that can do it?

R‌‌‌.. · Accepted Answer

To remove all images from a PDF file using PyPDF2 you can do:

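from PyPDF2 import PdfFileWriter, PdfFileReader

# Note: these camelCase names are the legacy API (PyPDF2 before 3.0).
with open("src.pdf", "rb") as input_stream:
    src = PdfFileReader(input_stream)
    output = PdfFileWriter()

    # Copy every page into the writer, then strip the images from all of them.
    for i in range(src.getNumPages()):
        output.addPage(src.getPage(i))
    output.removeImages()

    # Write while the source file is still open, since pages are read lazily.
    with open("dst.pdf", "wb") as output_stream:
        output.write(output_stream)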

Tags: python, pdf, pypdf2

sshilovsky
