I am trying to get the position of each character in an image file using the pytesseract library.
import pytesseract
from PIL import Image
print(pytesseract.image_to_string(Image.open('5.png')))
Is there a library that returns the position of each character?
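Did you try using pytesseract.image_to_data()?
import cv2
import pytesseract
# Load the image as a NumPy array so OpenCV can draw on it
img = cv2.imread('5.png')
data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
boxes = len(data['level'])
for i in range(boxes):
    (x, y, w, h) = (data['left'][i], data['top'][i], data['width'][i], data['height'][i])
    # Draw a green box around each detected element
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)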
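Note that image_to_data() reports boxes for pages, blocks, paragraphs, lines and words, not for single characters. For per-character positions, pytesseract also provides image_to_boxes(), which returns one line per character in Tesseract's box-file format ("char left bottom right top page", with the origin at the bottom-left corner of the image). Here is a minimal sketch of how that output could be converted to top-left coordinates. The helper names parse_char_boxes and char_positions are my own, not part of pytesseract.

```python
def parse_char_boxes(box_text, image_height):
    """Parse Tesseract box-file text into (char, x, y, w, h) tuples.

    Each input line looks like "char left bottom right top page", measured
    from the bottom-left corner. The returned x, y use a top-left origin,
    which is what PIL and OpenCV expect.
    """
    results = []
    for line in box_text.splitlines():
        parts = line.rsplit(' ', 5)
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        # Flip the vertical axis: the box's top edge becomes y
        results.append((char, left, image_height - top, right - left, top - bottom))
    return results


def char_positions(path):
    """Hypothetical wrapper: run Tesseract on an image file and parse the boxes."""
    # Imported here so the parser above can be used without Tesseract installed
    import pytesseract
    from PIL import Image

    img = Image.open(path)
    return parse_char_boxes(pytesseract.image_to_boxes(img), img.height)
```

With the image from the question, usage would look like: for ch, x, y, w, h in char_positions('5.png'): print(ch, x, y, w, h)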
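pytesseract may not be the best tool for getting positions, but you can have Tesseract write hOCR output, which includes bounding boxes. The call below uses the run_tesseract signature from older pytesseract releases: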
from pytesseract import pytesseract
pytesseract.run_tesseract('image.png', 'output', lang=None, boxes=False, config="hocr")
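In hOCR, each recognized word is a span with class "ocrx_word" whose title attribute contains "bbox x0 y0 x1 y1" (top-left origin). Below is a rough sketch of pulling those boxes out with the standard library only. The names HocrWordParser and parse_hocr_words are hypothetical helpers, and I am assuming the hOCR text follows that usual layout.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" entry inside an hOCR title attribute
BBOX_RE = re.compile(r'bbox (\d+) (\d+) (\d+) (\d+)')


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span being read, if any
        self._text = []
        self._depth = 0     # nesting depth inside the current word span

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            self._depth += 1  # nested tag such as <strong> inside a word
            return
        attrs = dict(attrs)
        if 'ocrx_word' in (attrs.get('class') or '').split():
            match = BBOX_RE.search(attrs.get('title') or '')
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._text = []
                self._depth = 0

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        # Closing tag of the word span itself: store the word and reset
        self.words.append((''.join(self._text).strip(), self._bbox))
        self._bbox = None


def parse_hocr_words(hocr):
    """Return a list of (word, bbox) pairs from hOCR given as str or bytes."""
    if isinstance(hocr, bytes):
        hocr = hocr.decode('utf-8')
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words
```

In recent pytesseract versions the hOCR can be obtained in memory with pytesseract.image_to_pdf_or_hocr('image.png', extension='hocr'), which returns bytes that can be passed straight to parse_hocr_words().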