
Python: How to replace text in a PDF

I have a PDF file and I want to replace some text in it and generate a new PDF. How can I do that in Python? I have tried reportlab, but it does not have any function to search for text and replace it. What other module can I use?

Dax Amin asked Jul 29 '15 14:07



People also ask

How do I replace text in a PDF in Python?

By inserting page[NameObject("/Contents")] = contents.decodedSelf before writer.addPage(page), we force PyPDF2 to update the content of the page object. This way I was able to overcome this problem and replace text in the PDF file.
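As a rough sketch of what editing a page's content involves, the example below replaces text inside a decoded content stream held as bytes. It is not PyPDF2 code: replace_in_content_stream and the sample stream are hypothetical names made up for illustration, and it assumes the text appears as a simple literal string followed by the Tj (show text) operator. Real PDFs often compress their streams, split text across TJ arrays, or use font-specific encodings, so this approach may not match them.

```python
import re

# Matches a PDF literal string operand followed by the Tj (show text) operator,
# e.g. b"(Hello world) Tj". Backslash-escaped characters inside are allowed.
_TJ_PATTERN = re.compile(rb"\((?P<text>(?:\\.|[^\\()])*)\)\s*Tj")


def _escape(value: str) -> bytes:
    """Encode text and escape the characters that are special in PDF literal strings."""
    raw = value.encode("latin-1")
    return raw.replace(b"\\", b"\\\\").replace(b"(", b"\\(").replace(b")", b"\\)")


def replace_in_content_stream(stream: bytes, old: str, new: str) -> bytes:
    """Replace `old` with `new` inside simple "(...) Tj" operands (hypothetical helper)."""
    old_b = _escape(old)
    new_b = _escape(new)

    def _swap(match):
        # Only the string operand is edited; operators and operands around it stay as-is.
        text = match.group("text").replace(old_b, new_b)
        return b"(" + text + b") Tj"

    return _TJ_PATTERN.sub(_swap, stream)


# Hypothetical, already-decoded content stream used only for demonstration.
sample = b"BT /F1 12 Tf 72 720 Td (I like origami.) Tj ET"
print(replace_in_content_stream(sample, "origami", "polygami"))
```

The modified bytes would still have to be written back into the page's content object by a PDF library before saving, which is the step the snippet above describes.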

How do I replace text in a PDF document?

With the PDF file open, press “Ctrl+F” on your keyboard and enter the text you would like to replace. Then type the new text in the Replace input field and click “Replace” to replace the text in the PDF.

How do you remove text from a PDF in Python?

Use extractText() to extract the text from the PDF page and pdfFileObj.close() to close the PDF file object. The replacement text would simply be "", since you want to remove every instance of a certain piece of text.
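A minimal sketch of the "replace with an empty string" idea, assuming the page text has already been extracted into a Python string (for example with extractText()). The remove_text helper and the sample text are hypothetical, and this only changes the extracted string, not the PDF file itself.

```python
import re


def remove_text(extracted: str, unwanted: str) -> str:
    """Remove every occurrence of `unwanted`, then tidy up leftover spaces."""
    cleaned = extracted.replace(unwanted, "")
    # Collapse runs of spaces left behind by the removal and trim the ends.
    return re.sub(r" {2,}", " ", cleaned).strip()


# Hypothetical extracted page text used only for demonstration.
page_text = "Draft copy. Quarterly report. Draft copy."
print(remove_text(page_text, "Draft copy."))
```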


1 Answer

You can try the Aspose.PDF Cloud SDK for Python. Aspose.PDF Cloud is a REST API solution for PDF processing. It is a paid API, and its free plan provides 50 credits per month.

I'm a developer evangelist at Aspose.

import asposepdfcloud
from asposepdfcloud.apis.pdf_api import PdfApi

# Get App key and App SID from https://cloud.aspose.com
pdf_api_client = asposepdfcloud.api_client.ApiClient(
    app_key='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
    app_sid='xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxxx')

pdf_api = PdfApi(pdf_api_client)
filename = '02_pages.pdf'           # local PDF file
remote_name = '02_pages.pdf'        # name of the file in cloud storage
copied_file = '02_pages_new.pdf'    # copy that will receive the replacement

# Upload the PDF file to storage
pdf_api.upload_file(remote_name, filename)

# Copy the uploaded file so the original stays unchanged
pdf_api.copy_file(remote_name, copied_file)

# Replace text in the copied file
text_replace = asposepdfcloud.models.TextReplace(old_value='origami', new_value='polygami', regex='true')
text_replace_list = asposepdfcloud.models.TextReplaceListRequest(text_replaces=[text_replace])

response = pdf_api.post_document_text_replace(copied_file, text_replace_list)
print(response)
Tilal Ahmad answered Sep 19 '22 11:09
