
Can't install textract on windows

I've tried many things, but installing the textract package on Windows with pip still fails.

I'm getting the following error:

[image: error]

I have no idea what to do, so I'd be really grateful for any advice. Thank you.

Sebastian Wdowiarz asked Jun 07 '18 14:06


2 Answers

Stolen from here:

I first needed to install swig with conda (Miniconda):

conda install swig

Then I downloaded the EbookLib 0.15 zip from the releases page:

https://github.com/aerkalov/ebooklib/releases

After unzipping it, I manually removed the Unicode character on line 44 of README.md (I used Notepad++).

Then I installed the module with pip:

cd to_unzipped_folder_path_here
pip install .

And finally:

pip install textract
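
The manual README.md edit described above could also be scripted. The following is only a sketch, not part of the original answer: the helper name strip_non_ascii is hypothetical, and it assumes the file is valid UTF-8 and that dropping every non-ASCII character (rather than only the one on line 44) is acceptable.

```python
from pathlib import Path


def strip_non_ascii(path):
    """Rewrite the file at `path` with every non-ASCII character removed.

    Hypothetical helper. Returns the number of characters dropped.
    """
    # Assumption: the file is UTF-8, so read it explicitly as UTF-8
    # instead of relying on the Windows default encoding.
    text = Path(path).read_text(encoding="utf-8")
    cleaned = "".join(ch for ch in text if ord(ch) < 128)
    # Only ASCII is left, so any default encoding can read the result.
    Path(path).write_text(cleaned, encoding="ascii")
    return len(text) - len(cleaned)
```

Run from inside the unzipped folder, strip_non_ascii("README.md") would be called once before pip install .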
Marcus Mann answered Oct 04 '22 23:10


(Windows 10, Python 3.7) I had more issues than others, but this builds on the previous answers:

  1. Make sure that Microsoft Visual Studio C++ Compiler for Python is installed

    • For Visual Studio C++ 14.0 (also required by Scrapy as of June 2019), follow these links in order:
      https://wiki.python.org/moin/WindowsCompilers -->
      https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2017 -->
      https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=Community&rel=16
    • Note: this may take a very long time to install, so be patient.
  2. python -m pip install --upgrade pip setuptools wheel

  3. pip install six --upgrade

  4. Download EbookLib version 0.15:

    • Unzip the .zip file
    • To avoid encoding errors, edit the long_description assignment in setup.py to read: long_description = open('README.md', encoding='utf-8').read(),
  5. Download Swig:

    • http://www.swig.org/download.html
    • Unzip the .zip file
    • Copy swig.exe into the Python installation folder, e.g. "C:\Users\username\AppData\Local\Programs\Python\Python37"
    • Copy the "typemaps" folder into the Python "Lib" folder, e.g. "C:\Program Files\swigwin-4.0.0\Lib\typemaps" --> "C:\Users\username\AppData\Local\Programs\Python\Python37\Lib\"
    • Copy the "*.swg" files into the Python "Lib" folder, e.g. "C:\Program Files\swigwin-4.0.0\Lib\*.swg" --> "C:\Users\username\AppData\Local\Programs\Python\Python37\Lib\"
    • Copy all the SWIG Python files into the Python "Lib" folder, e.g. "C:\Program Files\swigwin-4.0.0\Lib\python\*" --> "C:\Users\username\AppData\Local\Programs\Python\Python37\Lib\"
  6. cd into the unzipped EbookLib folder from the prompt, e.g. C:\> cd "C:\Users\username\Desktop\ebooklib-0.15"

  7. Run the installation for EbookLib: pip install .

  8. Run the textract installation: pip install textract

The output should be:

C:\Users\username\Desktop\ebooklib-0.15>pip install textract
Collecting textract
Requirement already satisfied: docx2txt==0.6 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (0.6)
Requirement already satisfied: beautifulsoup4==4.5.3 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (4.5.3)
Requirement already satisfied: EbookLib==0.15 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (0.15)
Requirement already satisfied: xlrd==1.0.0 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (1.0.0)
Requirement already satisfied: SpeechRecognition==3.6.3 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (3.6.3)
Requirement already satisfied: six==1.10.0 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (1.10.0)
Collecting pocketsphinx==0.1.3 (from textract)
  Using cached https://files.pythonhosted.org/packages/93/5f/a968e5d53d25e32deb78c3e169fd8612ecf53cc76e32cb40e19be35696af/pocketsphinx-0.1.3.tar.bz2
Requirement already satisfied: chardet==2.3.0 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (2.3.0)
Requirement already satisfied: argcomplete==1.8.2 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (1.8.2)
Requirement already satisfied: python-pptx==0.6.5 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (0.6.5)
Requirement already satisfied: lxml in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from EbookLib==0.15->textract) (4.3.3)
Requirement already satisfied: XlsxWriter>=0.5.7 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from python-pptx==0.6.5->textract) (1.1.8)
Requirement already satisfied: Pillow>=2.6.1 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from python-pptx==0.6.5->textract) (6.0.0)
Building wheels for collected packages: pocketsphinx
  Building wheel for pocketsphinx (setup.py) ... done
  Stored in directory: C:\Users\username\AppData\Local\pip\Cache\wheels\38\80\4f\ddc3e8c2b788f2c7f1d625ae870f6bafd3038ff04a3445a2f8
Successfully built pocketsphinx
Installing collected packages: pocketsphinx, textract
Successfully installed pocketsphinx-0.1.3 textract-1.6.1

C:\Users\username\Desktop\ebooklib-0.15>
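
As a rough illustration of why the encoding="utf-8" change in step 4 helps, the sketch below writes a small UTF-8 file and reads it back two ways. It is not taken from EbookLib: the function names and file contents are made up, and it assumes the Windows default encoding is cp1252, which is consistent with a 'charmap' error on byte 0x8d.

```python
import os
import tempfile


def read_with(encoding, path):
    """Read the whole file using an explicit text encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


def demo():
    """Return (cp1252_read_succeeded, text_read_as_utf8) for a sample file."""
    fd, path = tempfile.mkstemp(suffix=".md")
    os.close(fd)
    try:
        # U+010D is stored in UTF-8 as the bytes C4 8D, and cp1252
        # has no character assigned to byte 0x8D.
        with open(path, "w", encoding="utf-8") as f:
            f.write("name: \u010d\n")
        try:
            read_with("cp1252", path)
            cp1252_ok = True
        except UnicodeDecodeError:
            cp1252_ok = False
        return cp1252_ok, read_with("utf-8", path)
    finally:
        os.remove(path)
```

Calling demo() shows the cp1252 read failing while the UTF-8 read returns the original text, which is the same difference the setup.py edit makes for README.md.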

At the time of this writing, jsonschema has dependencies that conflict with textract. The following errors also came up while I was working out the proper installation:

ERROR: requests 2.22.0 has requirement chardet<3.1.0,>=3.0.2, but you'll have chardet 2.3.0 which is incompatible.
ERROR: camelot-py 0.7.2 has requirement chardet>=3.0.4, but you'll have chardet 2.3.0 which is incompatible.

ERROR: Command "python setup.py egg_info" failed with error code 1 in C:\Users\username\AppData\Local\Temp\pip-install-msmb9od3\EbookLib\
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1671: character maps to <undefined>
error: command 'C:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python37\\swig.exe' failed with exit status 1

ERROR: Failed building wheel for pocketsphinx
error: command 'swig.exe' failed: No such file or directory
  (1) : Error: Unable to find 'swig.swg'
  (3) : Error: Unable to find 'python.swg'
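
The swig-related errors above can be checked for before running pip. This is a minimal sketch with hypothetical helper names; it assumes the layout produced by step 5, where swig.swg and python.swg sit directly in the Python "Lib" folder and swig.exe is in a folder on PATH.

```python
import shutil
from pathlib import Path

# Files named in the "Unable to find" errors above.
REQUIRED_SWG = ("swig.swg", "python.swg")


def missing_swig_files(lib_dir, required=REQUIRED_SWG):
    """Return the required .swg file names that are absent from lib_dir."""
    lib = Path(lib_dir)
    return [name for name in required if not (lib / name).is_file()]


def swig_on_path():
    """Return the full path of the swig executable, or None if not on PATH."""
    # On Windows, shutil.which also tries the PATHEXT extensions such as .exe.
    return shutil.which("swig")
```

An empty list from missing_swig_files(r"C:\Users\username\AppData\Local\Programs\Python\Python37\Lib") together with a non-None result from swig_on_path() would suggest the copy steps were completed.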
Torc answered Oct 05 '22 00:10