I would like to be able to foveate an image with the focal point at the center of the image in Python. My input image can be represented as a 2D NumPy array. I'd like to get an output image with high resolution at the center, but blurry at the sides. I found an OpenCV function called logplar_interp for this purpose, but it does not seem to be present in the Python wrapper of OpenCV. I would appreciate any help.
An example of a foveated image is shown below (taken from Wikipedia):
The point of focus is the tombstones towards the top right, while the rest of the pixels become progressively more blurred as you move away from the point of focus.
Here's my attempt at recreating this using OpenCV Python. It's a rather hack-ish solution that is a bit computationally intensive, but it certainly gets the job done.
First, create a mask where pixels that are zero correspond to those pixels you want to keep at high resolution and pixels that are non-zero (255 in the code further down) correspond to those pixels you want to blur. To make things simple, I would create a filled circle of zero pixels that defines the high-resolution region.
With this mask, one tool that I can suggest to make this work is the distance transform. For each point in the binary mask, the corresponding output point in the distance transform is the distance from this point to the closest zero pixel. As such, the farther you venture from the zero pixels in the mask, the greater the distance will be.
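To make the definition concrete, here is a minimal NumPy-only sketch of what a distance transform computes. The helper name brute_force_distance_transform is hypothetical, and this is a brute-force illustration meant for tiny masks only; cv2.distanceTransform does the same job far more efficiently.

```python
import numpy as np

def brute_force_distance_transform(mask):
    """Euclidean distance from every pixel to the nearest zero pixel.

    Illustration only: memory grows as H * W * (number of zero pixels).
    """
    zero_rows, zero_cols = np.nonzero(mask == 0)
    rows, cols = np.indices(mask.shape)
    # Broadcast (H, W, 1) against (K,) to get the distance to every zero pixel
    all_dists = np.hypot(rows[..., None] - zero_rows,
                         cols[..., None] - zero_cols)
    return all_dists.min(axis=2)

# 7 x 7 mask with a single zero ("keep sharp") pixel in the middle
mask = np.full((7, 7), 255, dtype=np.uint8)
mask[3, 3] = 0

dist = brute_force_distance_transform(mask)
print(np.round(dist, 2))
```

The printed array is 0 at the centre and grows towards the corners, which is exactly the quantity that will drive the amount of blur.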
Therefore, the farther away you go from a zero pixel in this mask, the more blur you apply. Using this idea, I simply wrote a loop over the image that, at each point, creates a blur mask (averaging, Gaussian, or anything similar) whose size is proportional to the value in the distance transform, and blurs this point with that blur mask. Any points whose distance transform value is zero should have no blur applied to them. For all of the other points, we use the distance value to guide us in collecting a neighbourhood of pixels centered at this point and perform a blur. The larger the distance, the larger the pixel neighbourhood should be and thus the stronger the blur will be.
To simplify things, I'm going to use an averaging mask. Specifically, for each value in the distance transform, the size of this mask will be M x M, where M is:

M = d / S

d is a distance value from the distance transform and S is a scale factor that scales down the value of d so that the averaging can be more feasible. This is because the distance transform can get quite large as you go farther away from a zero pixel, and so the scale factor makes the averaging more realistic. Formally, for each pixel in our output, we collect a neighbourhood of M x M pixels, get an average and set this to be our output.
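As a small worked example of M = d / S, the sketch below turns a few distance values into window sizes. The function name kernel_size and its parameters are hypothetical; rounding up and enforcing a minimum size of 3 are assumptions that mirror what the full code further down does.

```python
import numpy as np

def kernel_size(d, scale_factor=10, min_size=3):
    """Side length M of the averaging window for distance value(s) d."""
    # M = d / S, rounded up to a whole number of pixels
    m = np.ceil(np.asarray(d, dtype=float) / scale_factor)
    # Never go below the smallest useful window
    return np.maximum(m, min_size).astype(int)

distances = np.array([5.0, 30.0, 42.0, 120.0])
print(kernel_size(distances))
```

With S = 10 these distances give window sizes of 3, 3, 5 and 12, so the blur only starts to grow once you are a few dozen pixels away from the sharp region.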
One intricacy to keep in mind: when the centre of the neighbourhood lies along the border of the image, we need to make sure that we only collect pixels within the boundaries of the image, so any locations that fall outside of the image are skipped.
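The border handling can be sketched in isolation like this. The helper clipped_window is a hypothetical name, and it assumes the window extends m // 2 pixels on each side of the centre, as in the full code further down.

```python
import numpy as np

def clipped_window(img, x, y, m):
    """Pixels within m // 2 of (x, y) in each direction, clipped to the image."""
    height, width = img.shape[:2]
    half = m // 2
    # Clamp the inclusive start/end coordinates to the valid index range
    begin_x, end_x = max(x - half, 0), min(x + half, width - 1)
    begin_y, end_y = max(y - half, 0), min(y + half, height - 1)
    return img[begin_y:end_y + 1, begin_x:end_x + 1]

img = np.arange(25, dtype=float).reshape(5, 5)

corner = clipped_window(img, x=0, y=0, m=3)    # only 2 x 2 pixels are inside
centre = clipped_window(img, x=2, y=2, m=3)    # full 3 x 3 window
print(corner.shape, corner.mean())
print(centre.shape, centre.mean())
```

At the corner the average is taken over the 4 pixels that actually exist rather than over 9, so no padding values leak into the result.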
Now it's time to show some results. For reference, I used the Cameraman image, a popular standard test image. It is shown here:
I'm also going to set the mask to be a circle of radius 25 centred at row 70 and column 100. Without further ado, here's the fully commented code. I'll let you parse through the comments yourself.
import cv2 # Import relevant libraries
import numpy as np

img = cv2.imread('cameraman.png', 0) # Read in image as grayscale
height = img.shape[0] # Get the dimensions
width = img.shape[1]

# Define mask
mask = 255*np.ones(img.shape, dtype='uint8')

# Draw circle at x = 100, y = 70 of radius 25 and fill this in with 0
cv2.circle(mask, (100, 70), 25, 0, -1)

# Apply distance transform to mask
# (cv2.DIST_L2 replaces cv.CV_DIST_L2 from the legacy cv module)
out = cv2.distanceTransform(mask, cv2.DIST_L2, 3)

# Define scale factor
scale_factor = 10

# Create output image that is the same as the original
filtered = img.copy()

# Create floating point copy for precision
img_float = img.copy().astype('float')

# Number of channels
if len(img_float.shape) == 3:
    num_chan = img_float.shape[2]
else:
    # If there is a single channel, make the images 3D with a singleton
    # dimension to allow for loop to work properly
    num_chan = 1
    img_float = img_float[:,:,None]
    filtered = filtered[:,:,None]

# For each pixel in the input...
for y in range(height):
    for x in range(width):

        # If distance transform is 0, skip
        if out[y,x] == 0.0:
            continue

        # Calculate M = d / S
        mask_val = np.ceil(out[y,x] / scale_factor)

        # If M is too small, set the mask size to the smallest possible value
        if mask_val <= 3:
            mask_val = 3

        # Get beginning and ending x and y coordinates for neighbourhood
        # and ensure they are within bounds
        beginx = x-int(mask_val/2)
        if beginx < 0:
            beginx = 0

        beginy = y-int(mask_val/2)
        if beginy < 0:
            beginy = 0

        endx = x+int(mask_val/2)
        if endx >= width:
            endx = width-1

        endy = y+int(mask_val/2)
        if endy >= height:
            endy = height-1

        # Get the coordinates of where we need to grab pixels
        xvals = np.arange(beginx, endx+1)
        yvals = np.arange(beginy, endy+1)
        (col_neigh,row_neigh) = np.meshgrid(xvals, yvals)
        col_neigh = col_neigh.astype('int')
        row_neigh = row_neigh.astype('int')

        # Get the pixels now
        # For each channel, do the foveation
        for ii in range(num_chan):
            chan = img_float[:,:,ii]
            pix = chan[row_neigh, col_neigh].ravel()

            # Calculate the average and set it to be the output
            filtered[y,x,ii] = int(np.mean(pix))

# Remove singleton dimension if required for display and saving
if num_chan == 1:
    filtered = filtered[:,:,0]

# Show the image
cv2.imshow('Output', filtered)
cv2.waitKey(0)
cv2.destroyAllWindows()
The output I get is:
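Since the per-pixel Python loop is the computationally intensive part, one option I'd consider is swapping it for a summed-area table (integral image), which gives the sum over any rectangle in constant time. The sketch below is an alternative to the loop, not the code that produced the output above. It assumes a single-channel image, the name foveate_box and its parameters are hypothetical, and the synthetic dist array merely stands in for the output of cv2.distanceTransform.

```python
import numpy as np

def foveate_box(img, dist, scale_factor=10, min_size=3):
    """Variable-size box blur driven by a distance map, without a pixel loop."""
    height, width = img.shape

    # Summed-area table with a leading row and column of zeros
    sat = np.zeros((height + 1, width + 1), dtype=np.float64)
    sat[1:, 1:] = img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

    # Per-pixel window size M = ceil(d / S), at least min_size
    m = np.maximum(np.ceil(dist / scale_factor), min_size).astype(int)
    half = m // 2

    # Window bounds clipped to the image (end coordinates are exclusive)
    ys, xs = np.indices(img.shape)
    y0 = np.clip(ys - half, 0, height - 1)
    y1 = np.clip(ys + half, 0, height - 1) + 1
    x0 = np.clip(xs - half, 0, width - 1)
    x1 = np.clip(xs + half, 0, width - 1) + 1

    # Rectangle sums from four table lookups, then divide by the pixel count
    total = sat[y1, x1] - sat[y0, x1] - sat[y1, x0] + sat[y0, x0]
    mean = total / ((y1 - y0) * (x1 - x0))

    # Leave pixels with zero distance untouched, truncate the rest to integers
    out = img.astype(np.float64)
    blurred = dist > 0
    out[blurred] = np.floor(mean[blurred])
    return out.astype(img.dtype)

# Synthetic stand-ins: a random image and distances measured from a small disc
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(20, 30), dtype=np.uint8)
ys, xs = np.indices(img.shape)
dist = np.maximum(np.hypot(ys - 8, xs - 12) - 3, 0)

result = foveate_box(img, dist, scale_factor=2)
print(result.shape, result.dtype)
```

With a real image you would pass the grayscale img and the distance transform out from the code above instead of the synthetic arrays. For a colour image, the same function could be applied to each channel separately.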