I want to use PDFMiner to extract text from PDF files. I have downloaded pdfminer-20131113 and installed Python in C:\python34.
From cmd, I change to the directory containing PDFMiner's setup.py file and run the following command:
python setup.py install
But I get the error below.
> D:\pdfminer-20101226>python setup.py install
Traceback (most recent call last):
  File "setup.py", line 3, in <module>
    from pdfminer import __version__
  File "D:\pdfminer-20101226\pdfminer\__init__.py", line 4
    if __name__ == '__main__': print __version__
                                               ^
SyntaxError: invalid syntax
It seems to be an error in PDFMiner's setup.py file, which I am not sure how to resolve.
I also saw a pdf2txt.py file in PDFMiner's build folder. I tried running it as pdf2txt.py -o output.html pdffilename.pdf (with the full path), but instead of converting the PDF, it just opens the pdf2txt.py file.
The PDFMiner project homepage states:
Written entirely in Python. (for version 2.4 or newer)
and further down:
Install Python 2.4 or newer. (Python 3 is not supported.)
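so you'll have to install Python 2 to run this project. The SyntaxError in your traceback comes from the Python 2 print statement (print __version__), which Python 3 does not accept.
Alternatively, you could try the Python 3 port, pdfminer3k; it hasn't seen any updates in 20 months, while PDFMiner does have more recent releases, so your mileage may vary.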
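To see why the traceback points at that line, here is a minimal sketch that compiles the statement from pdfminer/__init__.py next to a Python 3 style equivalent. The helper name compiles is made up for illustration and is not part of PDFMiner; run it with a Python 3 interpreter.

```python
def compiles(source):
    """Return True if `source` parses under the running interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The line from the traceback: a Python 2 print statement.
py2_style = "if __name__ == '__main__': print __version__"
# The same line with print called as a function, as Python 3 requires.
py3_style = "if __name__ == '__main__': print(__version__)"

print("Python 2 style parses:", compiles(py2_style))
print("Python 3 style parses:", compiles(py3_style))
```

On Python 3 only the second form parses, which is why setup.py fails on import before the install even starts.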
This should solve your problem on Python 3:
pip install pdfminer.six
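Once that is installed, something along these lines should get the text out. This is only a sketch: it assumes a pdfminer.six release recent enough to ship pdfminer.high_level.extract_text, and the wrapper name extract_pdf_text is my own, not part of the library.

```python
import sys


def extract_pdf_text(path):
    """Return the text of the PDF at `path` using pdfminer.six."""
    # Imported here so the script still loads if pdfminer.six is missing;
    # the ImportError then appears only when extraction is attempted.
    from pdfminer.high_level import extract_text
    return extract_text(path)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract.py pdffilename.pdf
    print(extract_pdf_text(sys.argv[1]))
```

Save it under any name (extract.py here is arbitrary) and start it with python in front of the script name, so Windows runs it with the interpreter rather than opening the file.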