Preserving Spaces in Tesseract

Tags:

python

python-tesseract

I have an image file that contains text in columns separated by tabs (2 spaces). But when I extract the text from this image, I always get a single space between two columns. A sample example:

IMAGE:

col-a    col-b    col-c

Desired output:

col-a    col-b    col-c

But I am getting the following:

col-a col-b col-c

I am using pytesseract.image_to_string (Python module) to convert the image to text.

asked Aug 03 '18 08:08

raghu

1 Answer

Pass Tesseract's preserve_interword_spaces option through the config argument, like this:

pytesseract.image_to_string(img, config='-c preserve_interword_spaces=1')
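For a slightly fuller picture, here is a minimal sketch of how that call might be wrapped. The helper names (build_config, image_to_spaced_text, split_columns) and the file name in the usage note are made up for illustration and are not part of pytesseract. The optional --psm value assumes Tesseract 4 or later; page segmentation mode 6 (a single uniform block of text) is only a guess at what may suit a simple table. The number of spaces Tesseract emits depends on the image, so it may not match the source exactly, which is why the last helper splits on any run of two or more spaces.

```python
import re


def build_config(psm=None):
    """Build a Tesseract config string that keeps the gaps between words."""
    parts = []
    if psm is not None:
        # Assumption: Tesseract 4+ syntax for the page segmentation mode.
        parts.append(f"--psm {psm}")
    # The option from the answer above: do not collapse inter-word spaces.
    parts.append("-c preserve_interword_spaces=1")
    return " ".join(parts)


def image_to_spaced_text(image_path, psm=None):
    """Run OCR on an image file with inter-word spacing preserved."""
    # Imported here so the other helpers work without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, config=build_config(psm))


def split_columns(line):
    """Split one OCR'd line into columns on runs of two or more spaces."""
    return re.split(r" {2,}", line.strip())
```

Usage would look something like text = image_to_spaced_text("table.png", psm=6), followed by [split_columns(row) for row in text.splitlines() if row.strip()] to get the columns of each row. A single space inside a cell is left alone, so only the wider gaps are treated as column boundaries.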

answered Nov 03 '22 00:11

Rajesh Subbiah
