I have some keywords that I collected earlier, and I want to search for them in a PDF document with Python and highlight them. Is this viable with a library like pdfMiner?
Yes, you can use the PyMuPDF library. Install it with: pip install PyMuPDF
Then use the following code:
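import fitz  # PyMuPDF
### READ IN PDF
doc = fitz.open(r"D:\XXXX\XXX.pdf")
page = doc[0]  # first page only
text = "Amey"
# search_for returns the rectangles where the text was found
# (older PyMuPDF releases called this method searchFor)
text_instances = page.search_for(text)
### HIGHLIGHT
for inst in text_instances:
    print(inst, type(inst))
    # older PyMuPDF releases called this method addHighlightAnnot
    highlight = page.add_highlight_annot(inst)
### OUTPUT
# Save under a different name (the "_highlighted" suffix is only a placeholder):
# PyMuPDF refuses a non-incremental save over the file it has open.
doc.save(r"D:\XXXX\XXX_highlighted.pdf", garbage=4, deflate=True, clean=True)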
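The snippet above handles one term on the first page, while the question asks about several keywords. A minimal sketch of how that could be extended to every page is shown here. The function names highlight_keywords and highlight_file, and their parameters, are made up for illustration and are not part of PyMuPDF; only fitz.open, search_for, add_highlight_annot, save and close are library calls.

```python
def highlight_keywords(doc, keywords):
    """Highlight every occurrence of each keyword on every page of `doc`.

    `doc` is an open PyMuPDF document (any iterable of pages that offer
    search_for() and add_highlight_annot() will do). Returns a dict that
    maps each keyword to the number of highlight annotations added.
    """
    counts = {kw: 0 for kw in keywords}
    for page in doc:
        for kw in keywords:
            # search_for returns one rectangle per matched area on the page
            for rect in page.search_for(kw):
                page.add_highlight_annot(rect)
                counts[kw] += 1
    return counts


def highlight_file(src_path, dst_path, keywords):
    """Hypothetical wrapper: open src_path, highlight, save to dst_path.

    dst_path should differ from src_path, because PyMuPDF rejects a
    non-incremental save over the file it opened.
    """
    import fitz  # PyMuPDF; imported here so the helper above has no hard dependency

    doc = fitz.open(src_path)
    try:
        counts = highlight_keywords(doc, keywords)
        doc.save(dst_path, garbage=4, deflate=True, clean=True)
    finally:
        doc.close()
    return counts
```

You would call it as highlight_file(input_path, output_path, ["Amey", "another keyword"]) with your own paths. The returned counts make it easy to see which keywords were not found at all.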