PyPDF2 to extract vertical text from scanned pdf

Question

I am trying to extract text from scanned PDFs using PyPDF2. Some of the PDFs contain vertically aligned text, even though the page orientation is portrait. Is there a way to detect that text is vertically aligned and to read those vertical lines, using either pdfminer or PyPDF2?

Martin Thoma · Accepted Answer

There is no way to do this with PyPDF2 at the moment (I'm the maintainer of PyPDF2).

See also: https://github.com/py-pdf/PyPDF2/issues/1071

Tags: python, python-3.x, pypdf2, pdfminer, pdf-extraction

Mms
