 

PyPDF2 to extract vertical text from scanned pdf

I am trying to extract text from scanned PDFs using PyPDF2. Some of the PDFs contain vertically aligned text, even though the page orientation is portrait. Is there any way to identify whether text is vertically aligned, and to read vertical lines in a PDF, using pdfminer or PyPDF2?

Mms asked Nov 17 '22 01:11


1 Answer

There is no way to do this with PyPDF2 at the moment (I'm the maintainer of PyPDF2).

See also: https://github.com/py-pdf/PyPDF2/issues/1071
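For readers who want to check text direction themselves, here is a minimal sketch. It is not part of PyPDF2's API. It assumes the PDF has a text layer (a purely scanned page has none until it is OCR'd) and that you have already obtained the first two components, a and b, of the text matrix [a b c d e f] for a run of text by some other means. The names `rotation_degrees` and `is_vertical` and the 5-degree tolerance are hypothetical choices for illustration. The sketch ignores the current transformation matrix and the page's /Rotate entry, both of which also affect how text finally appears.

```python
import math


def rotation_degrees(a: float, b: float) -> float:
    """Return the baseline angle, in degrees within [0, 360), implied by
    the first two components of a PDF text matrix [a b c d e f]."""
    return math.degrees(math.atan2(b, a)) % 360


def is_vertical(a: float, b: float, tolerance: float = 5.0) -> bool:
    """Return True if the baseline is within `tolerance` degrees of
    90 or 270, i.e. the text runs up or down the page.

    The tolerance value is an assumption, not taken from any library.
    """
    angle = rotation_degrees(a, b)
    return abs(angle - 90) <= tolerance or abs(angle - 270) <= tolerance


# Unrotated text has a = 1, b = 0; text rotated a quarter turn
# counterclockwise has a = 0, b = 1.
print(is_vertical(1, 0))
print(is_vertical(0, 1))
```

With a check like this, runs of text could be grouped by direction before deciding how to order them for reading.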

Martin Thoma answered Dec 10 '22 14:12