Disable dictionary in Tesseract

1 Answer

Try setting the following variables to false in a config file:

load_system_dawg 
load_freq_dawg
load_punc_dawg
load_number_dawg
load_unambig_dawg
load_bigram_dawg
load_fixed_length_dawgs

https://groups.google.com/forum/?fromgroups=#!searchin/tesseract-ocr/Disable$20dictionary$20in$20Tesseract/tesseract-ocr/5nvIo1DJxHE/f3gBi2pTKykJ
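As a rough illustration of the config-file approach, here is a minimal Python sketch that generates such a file. It assumes the usual Tesseract config format of one `name value` pair per line, with `F` meaning false. The helper names and the file name `no_dict` are hypothetical, and some of the listed variables may not exist in every Tesseract version, so check them against your installation.

```python
from pathlib import Path

# Variable names copied from the answer above; whether each one is
# recognized depends on the Tesseract version in use.
DAWG_VARIABLES = [
    "load_system_dawg",
    "load_freq_dawg",
    "load_punc_dawg",
    "load_number_dawg",
    "load_unambig_dawg",
    "load_bigram_dawg",
    "load_fixed_length_dawgs",
]


def build_no_dict_config(variables=DAWG_VARIABLES):
    """Return config-file text that sets every listed variable to false."""
    # One "name value" pair per line; F is read as false.
    return "".join(f"{name} F\n" for name in variables)


def write_config(path, text):
    """Write the config text to `path` (hypothetical helper) and return the path."""
    target = Path(path)
    target.write_text(text, encoding="ascii")
    return target


# Example use (the file name "no_dict" is an arbitrary choice):
# write_config("no_dict", build_no_dict_config())
```

The resulting file would then be passed as a trailing argument, along the lines of `tesseract input.png output no_dict`. Depending on the version, Tesseract may expect the file to live in its `tessdata/configs` directory rather than the working directory.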

Also read the FAQ entry "How to increase the trust in/strength of the dictionary?". From it:

For tesseract-ocr < 3.01 try upping NON_WERD and GARBAGE_STRING in dict/permute.cpp to maybe 3 or even 5.

For tesseract-ocr >= 3.01 try increasing the variables language_model_penalty_non_freq_dict_word and language_model_penalty_non_dict_word in a config file. By default they are 0.1 and 0.15 respectively.
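For the >= 3.01 case, the config entries could be generated as sketched below. The two variable names and their defaults (0.1 and 0.15) come from the FAQ text quoted above. The raised values 0.3 and 0.5 are arbitrary placeholders for illustration, not recommendations, and the function name is hypothetical.

```python
def build_penalty_config(non_freq_dict_word=0.3, non_dict_word=0.5):
    """Return config-file text that raises the two dictionary penalties.

    The defaults quoted in the FAQ are 0.1 and 0.15; the values used here
    are illustrative assumptions and should be tuned on real images.
    """
    # Same "name value" layout as any other Tesseract config file.
    return (
        f"language_model_penalty_non_freq_dict_word {non_freq_dict_word}\n"
        f"language_model_penalty_non_dict_word {non_dict_word}\n"
    )
```

Note that raising these penalties strengthens the dictionary's influence, which is the opposite of disabling it. It is the relevant knob when the goal is to tune how much the dictionary matters rather than to switch it off.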

185

answered Oct 06 '22 13:10

nguyenq


Tags: command-line, tesseract

sashoalm
