Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get rotational shift using phase correlation and log polar transform

I have been working on a script which calculates the rotational shift between two images using cv2's phaseCorrelate method.

I have two images, the second is a 90 degree rotated version of the first image. After loading in the images, I convert them to log-polar before passing them into the phaseCorrelate function.

From what I have read, I believe that this should yield a rotational shift between two images.

The code below describes the implementation.


#bitwise right binary shift function
def rshift(val, n): return (val % 0x100000000)

base_img = cv2.imread('img1.jpg')
cur_img = cv2.imread('dataa//t_sv_1.jpg')

curr_img = rotateImage(cur_img, 90)

rows,cols,chan = base_img.shape
x, y, c = curr_img.shape

#convert images to valid type
ref32 = np.float32(cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY))
curr32 = np.float32(cv2.cvtColor(curr_img, cv2.COLOR_BGR2GRAY))

value = np.sqrt(((rows/2.0)**2.0)+((cols/2.0)**2.0))
value2 = np.sqrt(((x/2.0)**2.0)+((y/2.0)**2.0))

polar_image = cv2.linearPolar(ref32,(rows/2, cols/2), value, cv2.WARP_FILL_OUTLIERS)
log_img = cv2.linearPolar(curr32,(x/2, y/2), value2, cv2.WARP_FILL_OUTLIERS) 

shift = cv2.phaseCorrelate(polar_image, log_img)

sx = shift[0][0]
sy = shift[0][1]
sf = shift[1]

polar_image = polar_image.astype(np.uint8)
log_img = log_img.astype(np.uint8)

cv2.imshow("Polar Image", polar_image)
cv2.imshow('polar', log_img)

#get rotation from shift along y axis
rotation = sy * 180 / (rshift(y, 1));
print(rotation) 

cv2.waitKey(0)
cv2.destroyAllWindows()

I am unsure how to interpret the results of this function. The expected outcome is a value similar to 90 degrees, however, I get the value below.

Output: -0.00717516014538333

How can I make the output correct?

like image 481
Dylan Freeman Avatar asked Sep 05 '19 08:09

Dylan Freeman


3 Answers

A method, typically referred to as the Fourier Mellin transform, and published as:

B. Srinivasa Reddy and B.N. Chatterji, "An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration", IEEE Trans. on Image Proc. 5(8):1266-1271, 1996

uses the FFT and the log-polar transform to obtain the translation, rotation and scaling of one image to match the other. I find this tutorial to be very clear and informative, I will give a summary here:

  1. Compute the magnitude of the FFT of the two images (apply a windowing function first to avoid issues with periodicity of the FFT).
  2. Compute the log-polar transform of the magnitude of the frequency-domain images (typically a high-pass filter is applied first, but I have not seen its usefulness).
  3. Compute the cross-correlation (actually phase correlation) between the two. This leads to a knowledge of scale and rotation.
  4. Apply the scaling and rotation to one of the original input images.
  5. Compute the cross-correlation (actually phase correlation) of the original input images, after correction for scaling and rotation. This leads to knowledge of the translation.

This works because:

  1. The magnitude of the FFT is translation-invariant, we can solely focus on scaling and rotation without worrying about translation. Note that the rotation of the image is identical to the rotation of the FFT, and that scaling of the image is inverse to the scaling of the FFT.

  2. The log-polar transform converts rotation into a vertical translation, and scaling into a horizontal translation. Phase correlation allows us to determine these translations. Converting them to a rotation and scaling is non-trivial (especially the scaling is hard to get right, but a bit of math shows the way).

If the tutorial linked above is not clear enough, one can look at the C++ code that comes with it, or at this other Python code.


OP is interested only in the rotation aspect of the method above. If we can assume that the translation is 0 (this means we know around which point the rotation was made, if we don't know the origin we need to estimate it as a translation), then we don't need to compute the magnitude of the FFT (remember it is used to make the problem translation invariant), we can apply the log-polar transform directly to the images. But note that we need to use the center of rotation as the origin for the log-polar transform. If we additionally assume that the scaling is 1, we can further simplify things by taking the linear-polar transform. That is, we logarithmic scaling of the radius axis is only necessary to estimate scaling.

OP is doing this more or less correctly, I believe. Where OP's code goes wrong is in the extent of the radius axis in the polar transform. By going all the way to the extreme corners of the image, OpenCV needs to fill in parts of the transformed image with zeros. These parts are dictated by the shape of the image, not by the contents of the image. That is, both polar images contain exactly the same sharp, high-contrast curve between image content and filled-in zeros. The phase correlation is aligning these curves, leading to an estimate of 0 degree rotation. The image content is more or less ignored because its contrast is much lower.

Instead, make the extent of the radius axis that of the largest circle that fits completely inside the image. This way, no parts of the output need to be filled with zeros, and the phase correlation can focus on the actual image content. Furthermore, considering the two images are rotated versions of each other, it is likely that the data in the corners of the images do not match, there is no need to take that into account at all!

Here is code I implemented quickly based on OP's code. I read in Lena, rotated the image by 38 degrees, computed the linear-polar transform of the original and rotated images, then the phase correlation between these two, and then determined a rotation angle based on the vertical translation. The result was 37.99560, plenty close to 38.

import cv2
import numpy as np

base_img = cv2.imread('lena512color.tif')
base_img = np.float32(cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)) / 255.0

(h, w) = base_img.shape
(cX, cY) = (w // 2, h // 2)

angle = 38
M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0)
curr_img = cv2.warpAffine(base_img, M, (w, h))

cv2.imshow("base_img", base_img)
cv2.imshow("curr_img", curr_img)

base_polar = cv2.linearPolar(base_img,(cX, cY), min(cX, cY), 0)
curr_polar = cv2.linearPolar(curr_img,(cX, cY), min(cX, cY), 0) 

cv2.imshow("base_polar", base_polar)
cv2.imshow("curr_polar", curr_polar)

(sx, sy), sf = cv2.phaseCorrelate(base_polar, curr_polar)

rotation = -sy / h * 360;
print(rotation) 

cv2.waitKey(0)
cv2.destroyAllWindows()

These are the four image windows shown by the code:

The four images shown by the code above

like image 194
Cris Luengo Avatar answered Nov 19 '22 17:11

Cris Luengo


Here's an approach to determine the rotational shift between two images in degrees. The idea is to find the skew angle for each image in relation to a horizontal line. If we can find this skewed angle then we can calculate the angle difference between the two images. Here are some example images to illustrate this concept

Original unrotated image

enter image description here

Rotated counterclockwise by 10 degrees (neg_10) and counterclockwise by 35 degrees (neg_35)

enter image description here enter image description here

Rotated clockwise by 7.9 degrees (pos_7_9) and clockwise by 21 degrees (pos_21)

enter image description here enter image description here


For each image, we want to determine the skew angle in relation to a horizontal line with negative being rotated counterclockwise and positive being rotated clockwise

enter image description here enter image description here

Here's the helper function to determine this skew angle

def compute_angle(image):
    # Convert to grayscale, invert, and Otsu's threshold
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = 255 - gray
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

    # Find coordinates of all pixel values greater than zero
    # then compute minimum rotated bounding box of all coordinates
    coords = np.column_stack(np.where(thresh > 0))
    angle = cv2.minAreaRect(coords)[-1]

    # The cv2.minAreaRect() function returns values in the range
    # [-90, 0) so need to correct angle
    if angle < -45:
        angle = -(90 + angle)
    else:
        angle = -angle

    # Rotate image to horizontal position 
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
              borderMode=cv2.BORDER_REPLICATE)

    return (angle, rotated)

After determining the skew angle for each image, we can simply calculate the difference

angle1, rotated1 = compute_angle(image1)
angle2, rotated2 = compute_angle(image2)

# Both angles are positive
if angle1 >= 0 and angle2 >= 0:
    difference_angle = abs(angle1 - angle2)
# One positive, one negative
elif (angle1 < 0 and angle2 > 0) or (angle1 > 0 and angle2 < 0):
    difference_angle = abs(angle1) + abs(angle2)
# Both negative
elif angle1 < 0 and angle2 < 0:
    angle1 = abs(angle1)
    angle2 = abs(angle2)
    difference_angle = max(angle1, angle2) - min(angle1, angle2)

Here's the step by step walk through of whats going on. Using pos_21 and neg_10, the compute_angle() function will return the skew angle and the normalized image

For pos_21, we normalize the image and determine the skew angle. Left (before) -> right (after)

enter image description here enter image description here

20.99871826171875

Similarly for neg_10, we also normalize the image and determine the skew angle. Left (before) -> right (after)

enter image description here enter image description here

-10.007980346679688

Now that we have both angles, we can compute the difference angle. Here's the result

31.006698608398438


Here's results with other combinations. With neg_10 and neg_35 we get

24.984039306640625

With pos_7_9 and pos_21,

13.09155559539795

Finally with pos_7_9 and neg_35,

42.89918231964111

Here's the full code

import cv2
import numpy as np

def rotational_shift(image1, image2):
    def compute_angle(image):
        # Convert to grayscale, invert, and Otsu's threshold
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = 255 - gray
        thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

        # Find coordinates of all pixel values greater than zero
        # then compute minimum rotated bounding box of all coordinates
        coords = np.column_stack(np.where(thresh > 0))
        angle = cv2.minAreaRect(coords)[-1]

        # The cv2.minAreaRect() function returns values in the range
        # [-90, 0) so need to correct angle
        if angle < -45:
            angle = -(90 + angle)
        else:
            angle = -angle

        # Rotate image to horizontal position 
        (h, w) = image.shape[:2]
        center = (w // 2, h // 2)
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, \
                  borderMode=cv2.BORDER_REPLICATE)

        return (angle, rotated)

    angle1, rotated1 = compute_angle(image1)
    angle2, rotated2 = compute_angle(image2)

    # Both angles are positive
    if angle1 >= 0 and angle2 >= 0:
        difference_angle = abs(angle1 - angle2)
    # One positive, one negative
    elif (angle1 < 0 and angle2 > 0) or (angle1 > 0 and angle2 < 0):
        difference_angle = abs(angle1) + abs(angle2)
    # Both negative
    elif angle1 < 0 and angle2 < 0:
        angle1 = abs(angle1)
        angle2 = abs(angle2)
        difference_angle = max(angle1, angle2) - min(angle1, angle2)

    return (difference_angle, rotated1, rotated2)

if __name__ == '__main__':
    image1 = cv2.imread('pos_7_9.png')
    image2 = cv2.imread('neg_35.png')

    angle, rotated1, rotated2 = rotational_shift(image1, image2)
    print(angle)
like image 43
nathancy Avatar answered Nov 19 '22 18:11

nathancy


I created a figure that shows the phase correlation values for multiple rotations. This has been edited to reflect Cris Luengo's comment. The image is cropped to get rid of the edges of the square insert.

rotations

import cv2
import numpy as np
paths = ["lena.png", "rotate45.png", "rotate90.png", "rotate135.png", "rotate180.png"]

import os
os.chdir('/home/stephen/Desktop/rotations/')

images, rotations, polar = [],[], []

for image_path in paths:
    alignedImage = cv2.imread('lena.png')
    rotatedImage = cv2.imread(image_path)

    rows,cols,chan = alignedImage.shape
    x, y, c = rotatedImage.shape

    x,y,w,h = 220,220,360,360
    alignedImage = alignedImage[y:y+h, x:x+h].copy()
    rotatedImage = rotatedImage[y:y+h, x:x+h].copy()

    #convert images to valid type
    ref32 = np.float32(cv2.cvtColor(alignedImage, cv2.COLOR_BGR2GRAY))
    curr32 = np.float32(cv2.cvtColor(rotatedImage, cv2.COLOR_BGR2GRAY))

    value = np.sqrt(((rows/2.0)**2.0)+((cols/2.0)**2.0))
    value2 = np.sqrt(((x/2.0)**2.0)+((y/2.0)**2.0))

    polar_image = cv2.linearPolar(ref32,(rows/2, cols/2), value, cv2.WARP_FILL_OUTLIERS)
    log_img = cv2.linearPolar(curr32,(x/2, y/2), value2, cv2.WARP_FILL_OUTLIERS) 

    shift = cv2.phaseCorrelate(polar_image, log_img)
    (sx, sy), sf = shift

    polar_image = polar_image.astype(np.uint8)
    log_img = log_img.astype(np.uint8)

    sx, sy, sf = round(sx, 4), round(sy, 4), round(sf, 4)
    text = image_path + "\n" + "sx: " + str(sx) + " \nsy: " + str(sy) + " \nsf: " + str(sf)

    images.append(rotatedImage)
    rotations.append(text)
    polar.append(polar_image)
like image 3
Stephen Meschke Avatar answered Nov 19 '22 18:11

Stephen Meschke