I want to use pdfminer to extract text information. I have downloaded pdfminer-20131113 and installed Python in C:\python34.
Using cmd, I change to the directory that contains pdfminer's setup.py file and run the following command:
python setup.py install
But I get the error below.
> D:\pdfminer-20101226>python setup.py install
> Traceback (most recent call last):
>   File "setup.py", line 3, in <module>
>     from pdfminer import __version__
>   File "D:\pdfminer-20101226\pdfminer\__init__.py", line 4
>     if __name__ == '__main__': print __version__
>                                                ^
> SyntaxError: invalid syntax
There seems to be an error in pdfminer's setup.py file, and I am not sure how to resolve it.
Also, I saw a pdf2txt.py file in the build folder of pdfminer. I tried running it as pdf2txt.py -o output.html pdffilename.pdf (with the full path), but instead of converting the PDF, it just opens the pdf2txt.py file.
The PDFMiner project homepage states:
Written entirely in Python. (for version 2.4 or newer)
and further down:
Install Python 2.4 or newer. (Python 3 is not supported.)
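so you'll have to install Python 2 to run this project. The SyntaxError in your traceback comes from the Python 2 print statement (print __version__) in pdfminer\__init__.py, which is not valid syntax in Python 3.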
Alternatively, you could try the Python 3 port, pdfminer3k; it hasn't seen any updates in 20 months, while PDFMiner does have more recent releases, so your mileage may vary.
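To confirm that the failure comes from the interpreter version rather than a broken download, here is a minimal sketch that asks the running interpreter whether the line from the traceback compiles. The helper name is_valid_syntax and the two constants are made up for this illustration and are not part of PDFMiner; only the first source line is copied from the traceback above.

```python
import sys


def is_valid_syntax(source):
    """Return True if `source` compiles under the interpreter running this script."""
    try:
        # compile() only parses the code; nothing in `source` is executed.
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# The line from pdfminer\__init__.py shown in the traceback (Python 2 print statement).
PY2_LINE = "if __name__ == '__main__': print __version__"
# The same line written with the print() function, which Python 3 requires.
PY3_LINE = "if __name__ == '__main__': print(__version__)"

if __name__ == "__main__":
    print("Running Python %d.%d" % sys.version_info[:2])
    print("print statement compiles:", is_valid_syntax(PY2_LINE))
    print("print() call compiles:", is_valid_syntax(PY3_LINE))
```

Run with the Python 3.4 install from the question, the first check should report False and the second True, which matches the SyntaxError raised while setup.py imports the package.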