I am trying to get the coordinates (positions) of text characters in an image using Tesseract. I want to know the exact pixel position so that I can click that text using some other tool.
Edit:
import pytesseract
from pytesseract import pytesseract
import PIL
from PIL import Image
import cv2
import csv
img = 'E:\\OCR-DATA\\sample.jpg'
imge = Image.open(img)
data=pytesseract.image_to_string(imge,lang='eng',boxes=True,config='hocr')
print(data)
data contains the recognized text along with bounding-box values, but I am not sure how to use those values to get the coordinates of the text.
The value of the data variable is as follows:
O 100 356 115 373 0
u 117 356 127 368 0
t 130 356 138 372 0
p 141 351 152 368 0
u 154 356 164 368 0
t 167 356 175 371 0
You can try this:
import pytesseract
from PIL import Image
img = 'tes.jpg'
imge = Image.open(img)
# image_to_boxes returns one line per recognized character
data = pytesseract.image_to_boxes(imge)
print(data)
This directly gives you a result like:
T 22 58 52 97 0
H 62 58 95 96 0
R 102 58 135 97 0
E 146 57 174 97 0
A 184 57 216 96 0
D 225 56 258 96 0
You have the coordinates of the bounding box in every line.
From: Training Tesseract – Make Box Files
character, left, bottom, right, top, page
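So for each character you get the character, followed by its bounding box coordinates, followed by the 0-based page number. Note that box files measure coordinates from the bottom-left corner of the image, so the bottom and top values count upward from the bottom edge.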
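To turn those lines into something a clicking tool can use, the values usually need converting, because box files place the origin at the bottom-left of the image while most screen and image tools count y from the top. Here is a minimal sketch of that conversion. The names CharBox and parse_boxes are hypothetical helpers, not part of pytesseract, and the image height of 120 is an assumed value for illustration; in practice you would pass imge.height from the opened PIL image.

```python
from typing import List, NamedTuple, Tuple


class CharBox(NamedTuple):
    """One recognized character with a top-left-origin bounding box (hypothetical helper)."""
    char: str
    left: int
    top: int      # distance from the top edge of the image
    right: int
    bottom: int   # distance from the top edge of the image

    @property
    def center(self) -> Tuple[int, int]:
        # Midpoint of the box, a reasonable point to click on.
        return ((self.left + self.right) // 2, (self.top + self.bottom) // 2)


def parse_boxes(box_text: str, image_height: int) -> List[CharBox]:
    """Parse 'char left bottom right top page' lines and flip the y axis."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        left, bottom, right, top = map(int, parts[1:5])
        # Box files count y upward from the bottom edge, so subtract from the height.
        boxes.append(CharBox(parts[0], left, image_height - top,
                             right, image_height - bottom))
    return boxes


# First two lines of the sample output above; image_height=120 is an assumption.
sample = "T 22 58 52 97 0\nH 62 58 95 96 0"
boxes = parse_boxes(sample, image_height=120)
for box in boxes:
    print(box.char, box.center)
```

If the image is a screenshot of only part of the screen, you would also need to add the offset of that region to each point before clicking.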