I am using Tesseract OCR for my program and I am going to convert it into a single .exe file using pyinstaller. The problem is that in order for Tesseract to work, I need to reference the path to the program installed on my computer, like this: pytesseract.pytesseract.tesseract_cmd = 'E:\\Tesseract-OCR\\tesseract'
Since this is not just a separate library that can be imported, but a standalone program, I can't pass it to pyinstaller as an '--add-data' argument. How do I make a one-file executable then?
Assuming you're on Windows, I ran into this problem and think I solved it by compiling a static version of tesseract (which does not need to be installed) and including its path as a binary in the pyinstaller spec file.
Official compiling instructions here:
https://tesseract-ocr.github.io/tessdoc/Compiling.html#windows
Install MS Visual Studio 15 (with C++) and vcpkg, then run one of the following from a command prompt:
for 64-bit: vcpkg install tesseract:x64-windows-static
for 32-bit: vcpkg install tesseract:x86-windows-static
The tesseract executable will be located a few subfolders deep inside the vcpkg folder on your PC. Along with that file, you also need to download a .traineddata file and place it in a folder called 'tessdata' in the same directory as the tesseract exe.
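If you don't want to hunt for the executable by hand, a small script can walk the vcpkg folder for you. This is only a sketch: find_tesseract_exe and the example vcpkg path are names I made up for illustration, and the exact subfolder vcpkg uses may differ between versions.

```python
import os


def find_tesseract_exe(vcpkg_root, exe_name="tesseract.exe"):
    """Return every path under vcpkg_root whose file name matches exe_name.

    The comparison ignores case because Windows file names are
    case-insensitive.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(vcpkg_root):
        for filename in filenames:
            if filename.lower() == exe_name.lower():
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical location; replace with wherever vcpkg lives on your PC.
    for path in find_tesseract_exe(r"C:\vcpkg"):
        print(path)
```

If it prints more than one path, pick the one that belongs to the static triplet you installed.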
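Create a pyinstaller spec file and edit the binaries=[] argument of Analysis() to include the folder where tesseract is located (if you're not using a subfolder for tesseract, I think you'd need to add both tesseract.exe and the tessdata subfolder). I also changed include_binaries=True (as far as I know, the option in the spec files PyInstaller generates is actually named exclude_binaries on EXE(), and it has to be False or absent for a one-file build).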
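For the spec file step, the entries PyInstaller expects in binaries and datas are (source path, destination folder inside the bundle) tuples. Here is a rough sketch of a helper that builds those lists. The function name collect_tesseract_files and the destination folder name 'tesseract' are my own choices, not anything PyInstaller requires, and it assumes the layout described above (tesseract.exe with a tessdata folder next to it).

```python
import os


def collect_tesseract_files(tesseract_dir, dest="tesseract"):
    """Build (source, destination_dir) tuples for a PyInstaller spec file.

    Assumes tesseract_dir contains tesseract.exe and a 'tessdata'
    subfolder holding one or more .traineddata files.
    """
    # The executable goes into <bundle>/<dest>/
    binaries = [(os.path.join(tesseract_dir, "tesseract.exe"), dest)]

    # Language files go into <bundle>/<dest>/tessdata/
    datas = []
    tessdata_dir = os.path.join(tesseract_dir, "tessdata")
    for name in sorted(os.listdir(tessdata_dir)):
        if name.endswith(".traineddata"):
            datas.append(
                (os.path.join(tessdata_dir, name), os.path.join(dest, "tessdata"))
            )
    return binaries, datas
```

Spec files are ordinary Python, so you could paste the function at the top of the spec file, call it with your tesseract folder, and pass the results as Analysis(..., binaries=binaries, datas=datas, ...).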
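Run pyinstaller and pass it the spec file directly, e.g. pyinstaller yourspecfile.spec (the --specpath option only sets the folder where a newly generated spec file is saved, so it can't be used to point at an existing one).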
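Once tesseract is bundled, the program also has to point pytesseract at the bundled copy instead of the E:\ path from the question. A one-file build unpacks itself into a temporary folder at startup and exposes that folder as sys._MEIPASS, so something like the sketch below should work. The helper names and the 'tesseract' subfolder are assumptions on my part; the subfolder has to match whatever destination you used in the spec file.

```python
import os
import sys


def bundled_tesseract_path(subdir="tesseract", exe_name="tesseract.exe"):
    """Return the expected path of tesseract inside the bundle.

    In a PyInstaller build, sys._MEIPASS is the folder the bundle was
    unpacked to. When running from source it does not exist, so fall
    back to the current working directory.
    """
    base = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base, subdir, exe_name)


def configure_pytesseract():
    """Point pytesseract at the bundled executable and return the path."""
    import pytesseract  # imported here so the helper above has no dependencies

    path = bundled_tesseract_path()
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Call configure_pytesseract() once at startup, before the first OCR call.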
I haven't tried it on a different PC yet, so I haven't fully tested that it works as intended. I don't know anything about compiling C++, so tesseract may need additional files or links that just happen to be present on the build PC, which is the only place I've tested.