I'm trying to implement the widely used fingerprint image enhancement algorithm proposed by Anil Jain et al. While implementing the steps for the ridge frequency image calculation in Section 2.5, I have difficulty understanding part of the description. The steps are described as follows:
If no minutiae and singular points appear in the oriented window, the x-signature forms a discrete sinusoidal-shape wave, which has the same frequency as that of the ridges and valleys in the oriented window. Therefore, the frequency of ridges and valleys can be estimated from the x-signature. Let T(i,j) be the average number of pixels between two consecutive peaks in the x-signature, then the frequency is computed as:
Ω(i,j) = 1 / T(i,j)
My question is: I don't understand how to get the average number of pixels between two consecutive peaks, since the paper doesn't say how to identify peaks within the algorithm. How do I decide which pixels are peaks so that I can measure the distance between them? Could someone explain what I missed here?
Besides, I implemented the steps up to this point using OpenCV as shown below. I would really appreciate it if someone could go through my steps and double-check that I am implementing them correctly:
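#include <cmath>
#include <vector>
#include <opencv2/core.hpp>

void Enhancement::frequency(cv::Mat inputImage, cv::Mat orientationMat)
{
    const int blockSize = 16;   // w: extent of the oriented window along d
    const int windowSize = 32;  // l: extent of the oriented window along k

    // Compute the x-signature of each block centered at (i, j).
    for (int i = blockSize / 2; i < inputImage.rows - blockSize / 2; i += blockSize)
    {
        for (int j = blockSize / 2; j < inputImage.cols - blockSize / 2; j += blockSize)
        {
            // The orientation is read at the block center, so fetch it once per block.
            const float theta = orientationMat.at<float>(i, j);
            const float cosT = std::cos(theta);
            const float sinT = std::sin(theta);

            std::vector<float> xSignature;
            for (int k = 0; k < windowSize; k++)
            {
                float sum = 0.0f;
                int count = 0;
                for (int d = 0; d < blockSize; d++)
                {
                    // The v offset must depend on k: (l/2 - k), mirroring (k - l/2) in u.
                    const int u = cvRound(i + (d - 0.5 * blockSize) * cosT + (k - 0.5 * windowSize) * sinT);
                    const int v = cvRound(j + (d - 0.5 * blockSize) * sinT + (0.5 * windowSize - k) * cosT);
                    // The rotated window can leave the image near the border; skip those samples.
                    if (u < 0 || u >= inputImage.rows || v < 0 || v >= inputImage.cols)
                        continue;
                    sum += static_cast<float>(inputImage.at<uchar>(u, v));
                    ++count;
                }
                // Average over the samples that actually fell inside the image.
                xSignature.push_back(count > 0 ? sum / count : 0.0f);
            }
        } // end of j-loop
    } // end of i-loop
}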
Update
After searching through some articles, I found someone describing how to determine whether a pixel is a peak, like this:
But I still don't understand it clearly. Does that mean I can apply a block-wise morphological dilation to my grayscale image (I've already converted my image from RGB to grayscale in OpenCV before further processing)?
Does the phrase "dilation equals original values" mean that the pixel intensity after morphological dilation equals its original value? I'm lost here.
I do not know the specific algorithm you are talking about, but maybe I can offer some general advice.
I guess the core of the problem is distinguishing what is a peak from what is just noise in a noisy signal (real-life input images are always noisy in some sense; I think the relevant input vector for peak detection in your code is xSignature). Once you have determined the peaks, calculating an average peak distance should be fairly straightforward.
As for peak detection, there are tons of papers describing quite sophisticated algorithms, but I'll outline some tried and true methods I'm using in my image processing job.
If you know the expected peak width w, you can as a first step apply some smoothing that gets rid of noise on a smaller scale by just summing over a window of about the expected peak width (from x-w/2 to x+w/2). You don't actually need to calculate the average value of the sliding window (divide by w), since for peak detection the absolute scale is irrelevant and the sum is proportional to the average value.
You can run over your (potentially smoothed) profile vector and identify minimum and maximum indices (e.g. by simple slope sign change). Store these positions in a map<int (coordinate), bool (isMax)> or map<int (coordinate), double (value at coordinate)>. Or use a struct as the value that holds all the relevant info (bool isMax, double value, bool isAtBoundary, ...).
For each maximum you found in the previous step, determine the height difference and maybe the slope to both the previous and the following minimum, resulting in a quality. This step depends on your problem domain. Maybe "peaks" need not be framed by a minimum value on both sides (in that case, your minimum detection above would have to be more sophisticated than slope change). Maybe there are min or max width restrictions on peaks. And so on.
Calculate a quality value based on the above questions for each maximum position. I often use something like Q_max = (average height difference from max to neighboring mins) / (max-min of profile). A peak candidate can then at most have a "quality" of 1, and at least 0.
Iterate over all your maxima, calculate their qualities and put them into a multimap or some other container, that can be sorted so that you can later iterate over your peaks in descending quality.
Iterate over your peaks in descending quality. Discard any that do not fulfill the requirements for a peak in your problem domain (minimum or maximum width, height, quality, distance to the nearest peak with higher quality, ...). Keep the rest. Done.
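The steps above can be sketched roughly as follows. This is only one possible reading of them, not the method from the paper: the names Peak, smoothProfile and findPeaks, the parameters minQuality and minDistance, and the quality formula (mean drop to the neighboring minima divided by the profile range) are all assumptions made for illustration. Plateaus are handled only crudely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical helper type: a peak candidate and its quality in [0, 1].
struct Peak {
    int index;      // position in the profile
    float quality;  // mean drop to neighboring minima / (max - min of profile)
};

// Box smoothing over [k - halfWidth, k + halfWidth]. Dividing by the actual
// window size avoids a bias at the ends, where the window is clipped.
std::vector<float> smoothProfile(const std::vector<float>& x, int halfWidth) {
    assert(halfWidth >= 0);
    const int n = static_cast<int>(x.size());
    std::vector<float> out(x.size(), 0.0f);
    for (int k = 0; k < n; ++k) {
        const int lo = std::max(0, k - halfWidth);
        const int hi = std::min(n - 1, k + halfWidth);
        float sum = 0.0f;
        for (int m = lo; m <= hi; ++m) sum += x[m];
        out[k] = sum / static_cast<float>(hi - lo + 1);
    }
    return out;
}

// Finds local maxima by slope sign change, scores them, and keeps the best
// ones. The result is ordered by descending quality.
std::vector<Peak> findPeaks(const std::vector<float>& x, float minQuality, int minDistance) {
    std::vector<Peak> kept;
    const int n = static_cast<int>(x.size());
    if (n < 3) return kept;
    const auto mm = std::minmax_element(x.begin(), x.end());
    const float range = *mm.second - *mm.first;
    if (range <= 0.0f) return kept;  // flat profile, no peaks

    std::vector<Peak> candidates;
    for (int k = 1; k + 1 < n; ++k) {
        if (!(x[k] > x[k - 1] && x[k] >= x[k + 1])) continue;  // not a local maximum
        int l = k;
        while (l > 0 && x[l - 1] <= x[l]) --l;       // walk downhill to the left minimum
        int r = k;
        while (r + 1 < n && x[r + 1] <= x[r]) ++r;   // walk downhill to the right minimum
        const float drop = 0.5f * ((x[k] - x[l]) + (x[k] - x[r]));
        candidates.push_back({k, drop / range});
    }

    // Visit candidates in descending quality and filter them.
    std::sort(candidates.begin(), candidates.end(),
              [](const Peak& a, const Peak& b) { return a.quality > b.quality; });
    for (const Peak& c : candidates) {
        if (c.quality < minQuality) break;  // everything after this is worse
        bool tooClose = false;
        for (const Peak& p : kept)
            if (std::abs(c.index - p.index) < minDistance) tooClose = true;
        if (!tooClose) kept.push_back(c);
    }
    return kept;
}
```

A call might look like findPeaks(smoothProfile(xSignature, 1), 0.5f, 3), where the thresholds are placeholders that would need tuning for the images at hand.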
In your case, you would then reorder the peaks by coordinate and calculate the average distance between them.
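For the x-signature, that last step might look like the sketch below. The function names and the -1 sentinel for "fewer than two peaks" are my own choices for illustration; the only thing taken from the question is that T is the average number of pixels between consecutive peaks, and the frequency is taken to be its reciprocal.

```cpp
#include <algorithm>
#include <vector>

// Mean distance in pixels between consecutive peaks, or -1 if there are
// fewer than two peaks. The vector is taken by value so it can be sorted.
float averagePeakSpacing(std::vector<int> peakPositions) {
    if (peakPositions.size() < 2) return -1.0f;
    std::sort(peakPositions.begin(), peakPositions.end());
    // The consecutive gaps telescope, so their mean is (last - first) / (count - 1).
    return static_cast<float>(peakPositions.back() - peakPositions.front())
         / static_cast<float>(peakPositions.size() - 1);
}

// Frequency as the reciprocal of the mean spacing T; -1 if T is undefined.
float ridgeFrequency(const std::vector<int>& peakPositions) {
    const float T = averagePeakSpacing(peakPositions);
    return T > 0.0f ? 1.0f / T : -1.0f;
}
```

With peaks at positions 2, 12 and 22, the mean spacing is 10 pixels, which gives a frequency of 0.1 cycles per pixel.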
I know this is vague, but there are no universally true answers for peak detection. Maybe in the paper you are working with there is a specific prescription hidden somewhere, but most authors omit such "mere technicalities" (and typically, if you contact them via email they can't remember or otherwise reproduce how they did it, which renders their results basically irreproducible).