Extract comments from pdf

Question

I have a collection of .pdf files with comments that were added in Adobe Acrobat. I would like to analyze these comments, but I'm stuck on extracting them. I've looked at the pdftools package, but it seems able to extract only the text, not the comments. Is there a way to extract the comments from within R?

Bernuly · Accepted Answer

PyMuPDF (https://pymupdf.readthedocs.io/en/latest/) is the only Python library I have found that works for this.

Installation in Debian/Ubuntu-based distributions:

apt-get install python3-fitz

Script:

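import fitz  # PyMuPDF is imported under the name "fitz"

doc = fitz.open("example.pdf")
# Iterate over the pages directly; doc.pageCount is the legacy name of
# doc.page_count and is not available in newer PyMuPDF releases.
for page in doc:
  for annot in page.annots():
    # annot.info is a dict; "content" holds the text of the comment
    print(annot.info["content"])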
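
Since the goal is to analyze the comments rather than just print them, here is a minimal sketch that collects them into rows and writes a CSV file. The helper names (annotation_rows, write_rows, pdf_comments_to_csv) and the column layout are my own choices for illustration, not part of PyMuPDF. It assumes that page.number, annot.type and the "title"/"content" keys of annot.info behave as described in the PyMuPDF documentation.

```python
import csv

FIELDS = ["page", "type", "author", "content"]


def annotation_rows(doc):
    """Yield one dict per annotation in an opened PyMuPDF document."""
    for page in doc:
        for annot in page.annots():
            info = annot.info
            yield {
                "page": page.number + 1,          # page.number is 0-based
                "type": annot.type[1],            # e.g. "Text" or "Highlight"
                "author": info.get("title", ""),  # the author is stored under "title"
                "content": info.get("content", ""),
            }


def write_rows(rows, handle):
    """Write the rows as CSV to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def pdf_comments_to_csv(pdf_path, csv_path):
    """Hypothetical convenience wrapper: PDF path in, CSV file out."""
    import fitz  # PyMuPDF; imported lazily so the helpers above do not require it

    doc = fitz.open(pdf_path)
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        write_rows(annotation_rows(doc), handle)
```

Calling pdf_comments_to_csv("example.pdf", "comments.csv") should produce a file that can be loaded in R with read.csv("comments.csv"), which keeps the analysis itself in R.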

Tags: r, pdf

Robert Bradford
