
Syntax error while installing pdfminer using python

I want to use pdfminer to extract text from PDF files. I have downloaded pdfminer-20131113 and installed Python in C:\python34. In cmd, I change to the directory that contains pdfminer's setup.py and run the following command:

python setup.py install

But I get the following error:

> D:\pdfminer-20101226>python setup.py install
Traceback (most recent call last):
  File "setup.py", line 3, in <module>
    from pdfminer import __version__
  File "D:\pdfminer-20101226\pdfminer\__init__.py", line 4
    if __name__ == '__main__': print __version__
                                               ^
SyntaxError: invalid syntax

It seems to be some error in the setup.py file of pdfminer, which I am not sure how to resolve.

Also, I saw a pdf2txt.py file in the build folder of pdfminer. I tried running it as pdf2txt.py -o output.html pdffilename.pdf (with the full path), but instead of converting the PDF, it opens the pdf2txt.py file.

Maverick asked Dec 12 '22 06:12


2 Answers

The PDFMiner project homepage states:

Written entirely in Python. (for version 2.4 or newer)

and further down:

Install Python 2.4 or newer. (Python 3 is not supported.)

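So you'll have to install Python 2 to run this project. The print __version__ line in your traceback is a Python 2 print statement, which Python 3 rejects as a SyntaxError.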
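To see why a Python 3 interpreter stops at that line, here is a minimal sketch that feeds the offending line from the traceback to the built-in compile() function. The helper name compiles and the PY3_LINE variant are illustrative assumptions, not part of PDFMiner.

```python
# The line from pdfminer/__init__.py that the traceback points at.
PY2_LINE = "if __name__ == '__main__': print __version__\n"

# The same line written with print as a function call (illustrative only).
PY3_LINE = "if __name__ == '__main__': print(__version__)\n"


def compiles(source):
    """Return True if the running interpreter can parse `source`."""
    try:
        # compile() only parses the code; nothing is executed.
        compile(source, "<pdfminer/__init__.py>", "exec")
        return True
    except SyntaxError:
        return False


print("print statement parses:", compiles(PY2_LINE))
print("print() call parses:", compiles(PY3_LINE))
```

Run under Python 3, the first check should report False and the second True, which matches the SyntaxError in the question. Editing that one line would not be enough to make the package work, since the rest of the code base is also written for Python 2.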

Alternatively, you could try the Python 3 port, pdfminer3k; it hasn't seen any updates in 20 months, while PDFMiner does have more recent releases, so your mileage may vary.

Martijn Pieters answered Feb 14 '23 14:02


This should solve your problem on Python 3:

pip install pdfminer.six
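Once the package is installed, text extraction could look like the sketch below. It assumes a reasonably recent pdfminer.six release, which provides extract_text in pdfminer.high_level; the helper name pdf_to_text and the file name in the usage comment are placeholders.

```python
def pdf_to_text(pdf_path):
    """Return the text content of the PDF at `pdf_path`.

    Hypothetical helper; it assumes pdfminer.six is installed.
    """
    # Imported inside the function so that a missing package only
    # fails when the helper is actually called.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)


# Example usage (placeholder path):
# print(pdf_to_text("pdffilename.pdf"))
```

If the import fails, pip has probably installed the package for a different Python interpreter than the one running the script.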
Sagun Shrestha answered Feb 14 '23 13:02