I have been working on a script which calculates the rotational shift between two images using cv2's phaseCorrelate
method.
I have two images, the second is a 90 degree rotated version of the first image. After loading in the images, I convert them to log-polar before passing them into the phaseCorrelate
function.
From what I have read, I believe that this should yield a rotational shift between two images.
The code below describes the implementation.
#bitwise right binary shift function
def rshift(val, n): return (val % 0x100000000)
base_img = cv2.imread('img1.jpg')
cur_img = cv2.imread('dataa//t_sv_1.jpg')
curr_img = rotateImage(cur_img, 90)
rows,cols,chan = base_img.shape
x, y, c = curr_img.shape
#convert images to valid type
ref32 = np.float32(cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY))
curr32 = np.float32(cv2.cvtColor(curr_img, cv2.COLOR_BGR2GRAY))
value = np.sqrt(((rows/2.0)**2.0)+((cols/2.0)**2.0))
value2 = np.sqrt(((x/2.0)**2.0)+((y/2.0)**2.0))
polar_image = cv2.linearPolar(ref32,(rows/2, cols/2), value, cv2.WARP_FILL_OUTLIERS)
log_img = cv2.linearPolar(curr32,(x/2, y/2), value2, cv2.WARP_FILL_OUTLIERS)
shift = cv2.phaseCorrelate(polar_image, log_img)
sx = shift[0][0]
sy = shift[0][1]
sf = shift[1]
polar_image = polar_image.astype(np.uint8)
log_img = log_img.astype(np.uint8)
cv2.imshow("Polar Image", polar_image)
cv2.imshow('polar', log_img)
#get rotation from shift along y axis
rotation = sy * 180 / (rshift(y, 1));
print(rotation)
cv2.waitKey(0)
cv2.destroyAllWindows()
I am unsure how to interpret the results of this function. The expected outcome is a value similar to 90 degrees, however, I get the value below.
Output: -0.00717516014538333
How can I make the output correct?
A method, typically referred to as the Fourier Mellin transform, and published as:
B. Srinivasa Reddy and B.N. Chatterji, "An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration", IEEE Trans. on Image Proc. 5(8):1266-1271, 1996
uses the FFT and the log-polar transform to obtain the translation, rotation and scaling of one image to match the other. I find this tutorial to be very clear and informative, I will give a summary here:
This works because:
The magnitude of the FFT is translation-invariant, we can solely focus on scaling and rotation without worrying about translation. Note that the rotation of the image is identical to the rotation of the FFT, and that scaling of the image is inverse to the scaling of the FFT.
The log-polar transform converts rotation into a vertical translation, and scaling into a horizontal translation. Phase correlation allows us to determine these translations. Converting them to a rotation and scaling is non-trivial (especially the scaling is hard to get right, but a bit of math shows the way).
If the tutorial linked above is not clear enough, one can look at the C++ code that comes with it, or at this other Python code.
OP is interested only in the rotation aspect of the method above. If we can assume that the translation is 0 (this means we know around which point the rotation was made, if we don't know the origin we need to estimate it as a translation), then we don't need to compute the magnitude of the FFT (remember it is used to make the problem translation invariant), we can apply the log-polar transform directly to the images. But note that we need to use the center of rotation as the origin for the log-polar transform. If we additionally assume that the scaling is 1, we can further simplify things by taking the linear-polar transform. That is, we logarithmic scaling of the radius axis is only necessary to estimate scaling.
OP is doing this more or less correctly, I believe. Where OP's code goes wrong is in the extent of the radius axis in the polar transform. By going all the way to the extreme corners of the image, OpenCV needs to fill in parts of the transformed image with zeros. These parts are dictated by the shape of the image, not by the contents of the image. That is, both polar images contain exactly the same sharp, high-contrast curve between image content and filled-in zeros. The phase correlation is aligning these curves, leading to an estimate of 0 degree rotation. The image content is more or less ignored because its contrast is much lower.
Instead, make the extent of the radius axis that of the largest circle that fits completely inside the image. This way, no parts of the output need to be filled with zeros, and the phase correlation can focus on the actual image content. Furthermore, considering the two images are rotated versions of each other, it is likely that the data in the corners of the images do not match, there is no need to take that into account at all!
Here is code I implemented quickly based on OP's code. I read in Lena, rotated the image by 38 degrees, computed the linear-polar transform of the original and rotated images, then the phase correlation between these two, and then determined a rotation angle based on the vertical translation. The result was 37.99560, plenty close to 38.
import cv2
import numpy as np
base_img = cv2.imread('lena512color.tif')
base_img = np.float32(cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)) / 255.0
(h, w) = base_img.shape
(cX, cY) = (w // 2, h // 2)
angle = 38
M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0)
curr_img = cv2.warpAffine(base_img, M, (w, h))
cv2.imshow("base_img", base_img)
cv2.imshow("curr_img", curr_img)
base_polar = cv2.linearPolar(base_img,(cX, cY), min(cX, cY), 0)
curr_polar = cv2.linearPolar(curr_img,(cX, cY), min(cX, cY), 0)
cv2.imshow("base_polar", base_polar)
cv2.imshow("curr_polar", curr_polar)
(sx, sy), sf = cv2.phaseCorrelate(base_polar, curr_polar)
rotation = -sy / h * 360;
print(rotation)
cv2.waitKey(0)
cv2.destroyAllWindows()
These are the four image windows shown by the code:
Here's an approach to determine the rotational shift between two images in degrees. The idea is to find the skew angle for each image in relation to a horizontal line. If we can find this skewed angle then we can calculate the angle difference between the two images. Here are some example images to illustrate this concept
Original unrotated image
Rotated counterclockwise by 10 degrees (neg_10
) and counterclockwise by 35 degrees (neg_35
)
Rotated clockwise by 7.9 degrees (pos_7_9
) and clockwise by 21 degrees (pos_21
)
For each image, we want to determine the skew angle in relation to a horizontal line with negative being rotated counterclockwise and positive being rotated clockwise
Here's the helper function to determine this skew angle
def compute_angle(image):
# Convert to grayscale, invert, and Otsu's threshold
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = 255 - gray
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find coordinates of all pixel values greater than zero
# then compute minimum rotated bounding box of all coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]
# The cv2.minAreaRect() function returns values in the range
# [-90, 0) so need to correct angle
if angle < -45:
angle = -(90 + angle)
else:
angle = -angle
# Rotate image to horizontal position
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
borderMode=cv2.BORDER_REPLICATE)
return (angle, rotated)
After determining the skew angle for each image, we can simply calculate the difference
angle1, rotated1 = compute_angle(image1)
angle2, rotated2 = compute_angle(image2)
# Both angles are positive
if angle1 >= 0 and angle2 >= 0:
difference_angle = abs(angle1 - angle2)
# One positive, one negative
elif (angle1 < 0 and angle2 > 0) or (angle1 > 0 and angle2 < 0):
difference_angle = abs(angle1) + abs(angle2)
# Both negative
elif angle1 < 0 and angle2 < 0:
angle1 = abs(angle1)
angle2 = abs(angle2)
difference_angle = max(angle1, angle2) - min(angle1, angle2)
Here's the step by step walk through of whats going on. Using pos_21
and neg_10
, the compute_angle()
function will return the skew angle and the normalized image
For pos_21
, we normalize the image and determine the skew angle. Left (before) ->
right (after)
20.99871826171875
Similarly for neg_10
, we also normalize the image and determine the skew angle. Left (before) ->
right (after)
-10.007980346679688
Now that we have both angles, we can compute the difference angle. Here's the result
31.006698608398438
Here's results with other combinations. With neg_10
and neg_35
we get
24.984039306640625
With pos_7_9
and pos_21
,
13.09155559539795
Finally with pos_7_9
and neg_35
,
42.89918231964111
Here's the full code
import cv2
import numpy as np
def rotational_shift(image1, image2):
def compute_angle(image):
# Convert to grayscale, invert, and Otsu's threshold
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = 255 - gray
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find coordinates of all pixel values greater than zero
# then compute minimum rotated bounding box of all coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]
# The cv2.minAreaRect() function returns values in the range
# [-90, 0) so need to correct angle
if angle < -45:
angle = -(90 + angle)
else:
angle = -angle
# Rotate image to horizontal position
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
borderMode=cv2.BORDER_REPLICATE)
return (angle, rotated)
angle1, rotated1 = compute_angle(image1)
angle2, rotated2 = compute_angle(image2)
# Both angles are positive
if angle1 >= 0 and angle2 >= 0:
difference_angle = abs(angle1 - angle2)
# One positive, one negative
elif (angle1 < 0 and angle2 > 0) or (angle1 > 0 and angle2 < 0):
difference_angle = abs(angle1) + abs(angle2)
# Both negative
elif angle1 < 0 and angle2 < 0:
angle1 = abs(angle1)
angle2 = abs(angle2)
difference_angle = max(angle1, angle2) - min(angle1, angle2)
return (difference_angle, rotated1, rotated2)
if __name__ == '__main__':
image1 = cv2.imread('pos_7_9.png')
image2 = cv2.imread('neg_35.png')
angle, rotated1, rotated2 = rotational_shift(image1, image2)
print(angle)
I created a figure that shows the phase correlation values for multiple rotations. This has been edited to reflect Cris Luengo's comment. The image is cropped to get rid of the edges of the square insert.
import cv2
import numpy as np
paths = ["lena.png", "rotate45.png", "rotate90.png", "rotate135.png", "rotate180.png"]
import os
os.chdir('/home/stephen/Desktop/rotations/')
images, rotations, polar = [],[], []
for image_path in paths:
alignedImage = cv2.imread('lena.png')
rotatedImage = cv2.imread(image_path)
rows,cols,chan = alignedImage.shape
x, y, c = rotatedImage.shape
x,y,w,h = 220,220,360,360
alignedImage = alignedImage[y:y+h, x:x+h].copy()
rotatedImage = rotatedImage[y:y+h, x:x+h].copy()
#convert images to valid type
ref32 = np.float32(cv2.cvtColor(alignedImage, cv2.COLOR_BGR2GRAY))
curr32 = np.float32(cv2.cvtColor(rotatedImage, cv2.COLOR_BGR2GRAY))
value = np.sqrt(((rows/2.0)**2.0)+((cols/2.0)**2.0))
value2 = np.sqrt(((x/2.0)**2.0)+((y/2.0)**2.0))
polar_image = cv2.linearPolar(ref32,(rows/2, cols/2), value, cv2.WARP_FILL_OUTLIERS)
log_img = cv2.linearPolar(curr32,(x/2, y/2), value2, cv2.WARP_FILL_OUTLIERS)
shift = cv2.phaseCorrelate(polar_image, log_img)
(sx, sy), sf = shift
polar_image = polar_image.astype(np.uint8)
log_img = log_img.astype(np.uint8)
sx, sy, sf = round(sx, 4), round(sy, 4), round(sf, 4)
text = image_path + "\n" + "sx: " + str(sx) + " \nsy: " + str(sy) + " \nsf: " + str(sf)
images.append(rotatedImage)
rotations.append(text)
polar.append(polar_image)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With