How to improve tesseract.js accuracy?

I'm using this piece of code from the website, but it's not accurate enough:

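  // Run inside an async function, since the calls below use await
  const { createWorker, createScheduler } = require("tesseract.js");

  // The scheduler distributes recognition jobs across its workers
  const scheduler = createScheduler();
  const worker1 = createWorker();
  const worker2 = createWorker();

  await worker1.load();
  await worker2.load();
  await worker1.loadLanguage("eng");
  await worker2.loadLanguage("eng");
  await worker1.initialize("eng");
  await worker2.initialize("eng");

  scheduler.addWorker(worker1);
  scheduler.addWorker(worker2);

  /** Add a single recognition job */
  const {
    data: { text }
  } = await scheduler.addJob("recognize", image);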

This is the type of image I'm trying to read text from:

enter image description here

Though it seems simple and easy, Tesseract sometimes fails to read it. Are there any better alternatives to tesseract.js, or any way to improve the accuracy?

asked Dec 01 '19 13:12

PayamB.

1 Answer

When applying OCR with Tesseract, it is important to preprocess the image so that the text to detect is black on a white background. To do this, you can apply a simple threshold to obtain a binary image. Here's the image after preprocessing:

enter image description here

Result from Tesseract

I implemented this approach in Python with OpenCV, but you can adapt a similar strategy to JavaScript.

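import cv2
import pytesseract

# Windows only: point pytesseract at the Tesseract executable (adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load the image as grayscale and apply Otsu's threshold to get a binary image.
# THRESH_BINARY_INV maps pixels above the threshold to black and the rest to white.
image = cv2.imread('1.png', cv2.IMREAD_GRAYSCALE)
thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Perform OCR; --psm 6 treats the image as a single uniform block of text
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

# Show the thresholded image until a key is pressed
cv2.imshow('thresh', thresh)
cv2.waitKey()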
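
For a JavaScript adaptation, the thresholding step could be sketched without OpenCV roughly as follows. This is only an illustration under assumptions: the function names `otsuThreshold` and `binarizeInverted` are hypothetical, the input is assumed to be a flat RGBA pixel buffer (such as the `data` field of a canvas `ImageData`), and the grayscale weights are the common luma approximation, which may not match OpenCV's output exactly.

```javascript
// Compute Otsu's threshold for an array of 8-bit grayscale values (0-255).
// Picks the level that maximizes the between-class variance.
function otsuThreshold(gray) {
  const hist = new Array(256).fill(0);
  for (const v of gray) hist[v]++;

  const total = gray.length;
  let sumAll = 0;
  for (let i = 0; i < 256; i++) sumAll += i * hist[i];

  let sumBg = 0;
  let weightBg = 0;
  let bestVariance = -1;
  let best = 0;
  for (let t = 0; t < 256; t++) {
    weightBg += hist[t];
    if (weightBg === 0) continue; // no pixels at or below t yet
    const weightFg = total - weightBg;
    if (weightFg === 0) break; // all pixels are at or below t
    sumBg += t * hist[t];
    const meanBg = sumBg / weightBg;
    const meanFg = (sumAll - sumBg) / weightFg;
    const between = weightBg * weightFg * (meanBg - meanFg) ** 2;
    if (between > bestVariance) {
      bestVariance = between;
      best = t;
    }
  }
  return best;
}

// Convert an RGBA buffer to an inverted binary image (hypothetical helper):
// pixels brighter than the threshold become black, all others white.
function binarizeInverted(rgba) {
  const n = rgba.length / 4;
  const gray = new Uint8Array(n);
  for (let i = 0; i < n; i++) {
    const r = rgba[4 * i];
    const g = rgba[4 * i + 1];
    const b = rgba[4 * i + 2];
    gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }

  const threshold = otsuThreshold(gray);
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < n; i++) {
    const v = gray[i] > threshold ? 0 : 255;
    out[4 * i] = v;
    out[4 * i + 1] = v;
    out[4 * i + 2] = v;
    out[4 * i + 3] = 255; // fully opaque
  }
  return out;
}
```

In a browser, the buffer could come from `getImageData` on a canvas 2D context, and the result could be written back with `putImageData` before passing the canvas to `recognize`. If the source image already has dark text on a light background, the inversion should be dropped. The `--psm 6` option used in the Python version appears to correspond to the `tessedit_pageseg_mode` parameter, which tesseract.js accepts through `worker.setParameters`.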

answered Oct 17 '22 00:10

nathancy

Tags: javascript, node.js, typescript, ocr, tesseract.js
