I am trying to isolate text from an image with OpenCV before sending it to the Tesseract 4 engine to maximize results.
I found this interesting post and decided to copy the source and try it by myself.
However, I am getting an issue with the first call to OpenCV.
To reproduce:
Simply copy the code from the gist
launch the command python script.py /path/to/image.jpg
I am getting this error:
Required argument 'threshold2' (pos 4) not found
Do you maybe have an idea of what this means? I am a JavaScript, Java, and Bash script developer, but not a Python one...
Here is a simplified version:
import glob
import os
import random
import sys
import math
import json
from collections import defaultdict

import cv2
from PIL import Image, ImageDraw
import numpy as np
from scipy.ndimage.filters import rank_filter

if __name__ == '__main__':
    if len(sys.argv) == 2 and '*' in sys.argv[1]:
        files = glob.glob(sys.argv[1])
        random.shuffle(files)
    else:
        files = sys.argv[1:]

    for path in files:
        out_path = path.replace('.jpg', '.crop.png')
        if os.path.exists(out_path):
            continue
        orig_im = Image.open(path)
        edges = cv2.Canny(np.asarray(orig_im), 100, 200)
Thanks in advance for your help
Edit: okay so this answer is apparently wrong, as I tried to send my own 16-bit int image into the function and couldn't reproduce the results.
Edit2: So I can reproduce the error with the following:
from PIL import Image
import numpy as np
import cv2
orig_im = Image.open('opencv-logo2.png')
threshold1 = 50
threshold2 = 150
edges = cv2.Canny(orig_im, threshold1, threshold2)
TypeError: Required argument 'threshold2' (pos 4) not found
So if the image is not cast to an array, i.e., the Image class instance is passed in, I get the error. The PIL Image class carries a lot of metadata beyond the raw pixel data, so casting to an np.array is necessary before passing it into OpenCV functions. But if it is properly cast, everything runs fine for me.
After a chat with Dan Mašek, it turns out my idea below is a bit incorrect. It is true that the newer Canny() overload wants 16-bit images, but the bindings don't inspect the actual numpy dtype to see what bit depth it is when deciding which function call to use. Plus, if you try to actually send a uint16 image in, you get a different error:
edges = cv2.Canny(np.array([[0, 1234], [1234, 2345]], dtype=np.uint16), 50, 100)
error: (-215) depth == CV_8U in function Canny
So the answer I originally gave (below) is not the total culprit. Perhaps you accidentally removed the np.array() cast of orig_im and got that error, or something else weird is going on.
Original (wrong) answer
In OpenCV 3.2.0, a new overload of Canny() was introduced to allow users to specify their own gradient image. In the original implementation, Canny() would use the Sobel() operator to calculate the gradients, but now you can calculate, say, the Scharr() derivatives and pass those into Canny() instead. So that's pretty cool. But what does this have to do with your problem?
The Canny() method is overloaded, and it decides which variant you want based on the arguments you send in. The original call to Canny() with the required arguments looks like
cv2.Canny(image, threshold1, threshold2)
but the new overloaded method looks like
cv2.Canny(grad_x, grad_y, threshold1, threshold2)
Now, there was a hint in your error message:
Required argument 'threshold2' (pos 4) not found
Which one of these calls had threshold2
in position 4? The newer method call! So why was that being called if you only passed three args? Note that you were getting the error if you used a PIL
image, but not if you used a numpy
image. So what else made it assume you were using the new call?
If you check the OpenCV 3.3.0 Canny() docs, you'll see that the original Canny() call requires an 8-bit input image for the first positional argument, whereas the new Canny() call requires a 16-bit x derivative of the input image (CV_16SC1 or CV_16SC3) for the first positional argument.
Putting two and two together, PIL was giving you a 16-bit input image, so OpenCV thought you were trying to call the new method.
So the solution here, if you want to continue using PIL, is to convert your image to an 8-bit representation. First off, Canny() needs a single-channel (i.e. grayscale) image to run, so make sure the image is single-channel, then scale it and change the numpy dtype. I believe PIL will read a grayscale image as single-channel (OpenCV, by default, reads all images as three-channel unless you tell it otherwise).
If the image is 16-bit, then the conversion is easy with numpy:
img = (img // 256).astype('uint8')
This assumes img is a numpy array, so you would need to cast the PIL image to an ndarray first with np.array() or np.asarray(). Then you should be able to run Canny() with the original function call.