How does one install Tesseract-OCR 3.03 in Ubuntu/Linux distributions?

Tags: linux, ubuntu, ocr, tesseract

A friend and I are interested in training the Tesseract OCR engine for a CV project. We tried some wrappers such as PyTesser and pyocr, but the results are not yet as accurate as we need them to be. We therefore want to train Tesseract to perform better for our purpose (identifying text on food labels), but we are having trouble installing the training tools.

What we've tried:

According to the 'Compiling' page on Tesseract's Google Code wiki, the training tools are only available in version 3.03. However, the Google Code 'Downloads' page for tesseract-ocr only has the materials for 3.02. The bottom of the 'Compiling' page also has some comments about installing version 3.03 on Windows and OS X, but none yet for Linux users.

There also appears to be a 3.03 source package for Ubuntu, but we're not sure how to access it on our computers, and the 'Compiling' page says we need to run these commands:

make training
sudo make training-install

We've also found a Google Group thread about Tesseract 3.03, but again the posts do not seem to include advice for Linux users (unless we missed something during the initial read).

Is this actually a really simple command-line install problem? Or is there a way to train Tesseract with 3.02 (which we currently have installed)? Have we been looking in the wrong places for information?

Any advice or links to instructions for installing tesseract-ocr 3.03 for Linux distributions would be greatly appreciated! Thanks.

582

asked Jun 13 '14 20:06

greenteawarrior

2 Answers

Tesseract can be installed directly on Ubuntu 14.04 using:

sudo apt-get install tesseract-ocr

I don't know whether this works on older versions of Ubuntu, because the package in the repository may only have been updated in later releases.
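To see which version a given Ubuntu release would actually install, a rough sketch like the one below may help. It assumes `apt-cache` is available and that the candidate version string starts with a plain dotted version (no epoch prefix). `version_ge` is a hypothetical helper written for this example, not part of any package, and the 3.03 threshold is simply the version mentioned in the question.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if dotted version $1 is >= dotted version $2.
# Each component is compared numerically; a suffix such as "-3" is ignored.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Version the configured repositories would install (empty if apt-cache
# is unavailable or the package is unknown).
candidate=$(apt-cache policy tesseract-ocr 2>/dev/null | awk '/Candidate:/ {print $2}')

if [ -n "$candidate" ] && [ "$candidate" != "(none)" ]; then
    if version_ge "$candidate" 3.03; then
        echo "repository offers $candidate (3.03 or newer)"
    else
        echo "repository offers $candidate (older than 3.03)"
    fi
else
    echo "no tesseract-ocr candidate found; 'sudo apt-get update' may be needed"
fi
```

This only reads the local package index, so it needs neither root nor an actual install.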

178

answered Nov 15 '22 20:11

erluxman

I had an AWS Ubuntu 14.04 instance. When I tried installing Tesseract with

sudo apt-get install tesseract-ocr

It reported that the package could not be found.

But this worked for me:

sudo apt-get update
sudo apt-get install tesseract-ocr
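
After installing, it may be worth checking which version ended up on the machine and whether any of the training executables came with it. The sketch below is only a quick check under some assumptions: `check_tools` is a hypothetical helper defined here, and the tool names listed are some of the executables from Tesseract's training tools, which a distribution may package separately or not at all.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named command is on PATH.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# Some releases print the version banner to stderr, so merge both streams.
if command -v tesseract >/dev/null 2>&1; then
    tesseract --version 2>&1 | head -n 1
fi

# The main binary plus a few of the training executables.
check_tools tesseract text2image unicharset_extractor mftraining cntraining combine_tessdata
```

If the training executables are reported missing, the plain `tesseract-ocr` package most likely does not include them, and the `make training` route from the question would still be needed.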

answered Nov 15 '22 20:11

Venkatesh Mondi
