
How to replace/delete text from a PDF using Python?

I have code that hides parts of a PDF by covering them with a white polygon, but the text is still there: if you press Ctrl+F you can still find it.

My goal is to actually remove the text from the PDF itself. Using pdfminer I managed to extract the text, but I don't know if it's possible to "replace" it with, say, empty spaces. Is such a thing possible using Python? Extracting the text isn't enough; I need it removed from the PDF.

Wallace asked Sep 15 '18 17:09





1 Answer

This is somewhat memory-intensive, but you can copy everything in the PDF except the part you are removing, then overwrite the file with the new version, which no longer contains that part. You can do this with PyPDF2 by retrieving a page's content stream, then finding and removing the relevant parts.
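As a rough illustration of the "find and remove" step, here is a minimal sketch that works on the already-decoded bytes of a content stream. It is not PyPDF2 code: the `blank_text` helper and the sample stream are hypothetical, and it assumes the text is stored as a literal string shown by a single `Tj` operator. Real PDFs often split text across several operators, use `TJ` arrays or hex strings, or use font encodings in which the visible characters never appear literally, so treat this as a starting point only.

```python
import re

# Matches one literal-string text-showing operation such as "(Hello) Tj".
# The string body may contain backslash escapes like "\)" but not nested
# unescaped parentheses, which is a simplification of the PDF syntax.
_TJ_LITERAL = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj\b", re.DOTALL)


def blank_text(stream: bytes, target: bytes) -> bytes:
    """Empty every "(...) Tj" whose string body contains `target`."""

    def _replace(match):
        if target in match.group(1):
            return b"() Tj"  # keep the operator, drop the text itself
        return match.group(0)

    return _TJ_LITERAL.sub(_replace, stream)


# Hypothetical, hand-written content stream used only for illustration.
sample = b"BT /F1 12 Tf 72 720 Td (Secret: 1234) Tj 0 -14 Td (Public note) Tj ET"
cleaned = blank_text(sample, b"Secret")
print(cleaned)
```

The operator is kept with an empty string rather than deleted outright, so the surrounding positioning operators stay valid. The cleaned bytes would then have to be written back into the page and the file saved, which is the "copy and overwrite" part described above.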

PyPDF2 docs: https://pythonhosted.org/PyPDF2/PageObject.html?highlight=getcontents#PyPDF2.pdf.PageObject.getContents

PDF standard: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf (pp. 78 and 81)

Xander Bielby answered Sep 20 '22 13:09
