 

How to understand the average pixel number described in the frequency image?

I'm trying to implement the widely used fingerprint image enhancement algorithm proposed by Anil Jain et al. While implementing the steps for ridge frequency image calculation in Section 2.5, I am having difficulty understanding part of the description. The steps are described as follows:

  1. Obtain normalized image G.
  2. Divide G into blocks of size w x w (16 x 16).
  3. For each block centered at the pixel (i, j), compute an oriented window of size l x w (32x16) that is defined in the ridge coordinate system.
  4. For each block centered at pixel (i, j), compute the x-signature, X[0], X[1], ..., X[l-1], of the ridges and valleys within the oriented window, where

         X[k] = (1/w) * sum_{d=0}^{w-1} G(u, v),   k = 0, 1, ..., l-1

         u = i + (d - w/2) * cos(O(i, j)) + (k - l/2) * sin(O(i, j))
         v = j + (d - w/2) * sin(O(i, j)) + (l/2 - k) * cos(O(i, j))

If no minutiae and singular points appear in the oriented window, the x-signature forms a discrete sinusoidal-shaped wave, which has the same frequency as that of the ridges and valleys in the oriented window. Therefore, the frequency of ridges and valleys can be estimated from the x-signature. Let T(i, j) be the average number of pixels between two consecutive peaks in the x-signature; then the frequency is computed as:

    F(i, j) = 1 / T(i, j)

My question is: I don't understand how to get the average number of pixels between two consecutive peaks, since the paper doesn't mention how to detect the peaks in the first place. So how do I identify the peak pixels in order to count them? Could someone explain what I am missing here?

Besides, I have implemented the steps up to this point using OpenCV, as shown below. I would really appreciate it if someone could go through my code and help me double-check that I am implementing it correctly:

void Enhancement::frequency(cv::Mat inputImage, cv::Mat orientationMat)
{
    int blockSize = 16;   // w in the paper
    int windowSize = 32;  // l in the paper

    // Compute the x-signature of each block centered at (i, j).
    for (int i = blockSize / 2; i < inputImage.rows - blockSize / 2; i += blockSize)
    {
        for (int j = blockSize / 2; j < inputImage.cols - blockSize / 2; j += blockSize)
        {
            // Block orientation; constant within the block, so read it once.
            float theta = orientationMat.at<float>(i, j);
            std::vector<float> xSignature;

            for (int k = 0; k < windowSize; k++)
            {
                float sum = 0.0f;

                for (int d = 0; d < blockSize; d++)
                {
                    // Oriented-window coordinates from the paper:
                    //   u = i + (d - w/2) cos(theta) + (k - l/2) sin(theta)
                    //   v = j + (d - w/2) sin(theta) + (l/2 - k) cos(theta)
                    // Note: the v term is (l/2 - k), not (0.5 - l), and the
                    // coordinates must be rounded, not truncated.
                    int u = cvRound(i + (d - 0.5 * blockSize) * cos(theta)
                                      + (k - 0.5 * windowSize) * sin(theta));
                    int v = cvRound(j + (d - 0.5 * blockSize) * sin(theta)
                                      + (0.5 * windowSize - k) * cos(theta));

                    // Skip samples that fall outside the image.
                    if (u < 0 || u >= inputImage.rows || v < 0 || v >= inputImage.cols)
                        continue;

                    sum += static_cast<float>(inputImage.at<uchar>(u, v));
                }

                // Average over the block width, per the 1/w factor in the paper.
                xSignature.push_back(sum / blockSize);
            }
        } // end of j-loop
    } // end of i-loop
}

Update

After searching through some articles, I found someone who mentioned how to determine whether a pixel is a peak, like this:

  1. Perform grayscale dilation on each block
  2. Find where the dilation equals original values

But still, I don't understand it clearly. Does that mean I can apply a block-wise morphological dilation operation to my grayscale image (I've already converted my image from RGB to grayscale in OpenCV before further processing)? And does "the dilation equals original values" mean that a pixel is a peak if its intensity after morphological dilation equals its original value? I'm lost here.
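If I understand the idea correctly, it can be sketched without OpenCV: a 1-D grayscale dilation replaces each sample with the maximum over a neighborhood, so a sample is a local maximum exactly where the dilated signal equals the original. A minimal sketch of my understanding (function and variable names are mine, operating on a 1-D signal such as the x-signature rather than the image):

```cpp
#include <vector>
#include <algorithm>

// A sample is a peak candidate exactly where the dilated signal (window
// maximum over half-width r) equals the original sample.
std::vector<int> dilationPeaks(const std::vector<float>& x, int r)
{
    std::vector<int> peaks;
    const int n = static_cast<int>(x.size());
    for (int i = 0; i < n; ++i)
    {
        int lo = std::max(0, i - r);
        int hi = std::min(n - 1, i + r);
        // 1-D grayscale dilation at i: maximum over the window [lo, hi].
        float windowMax = *std::max_element(x.begin() + lo, x.begin() + hi + 1);
        if (x[i] == windowMax)   // dilation equals original => local maximum
            peaks.push_back(i);
    }
    return peaks;
}
```

Note that a flat plateau produces several adjacent candidates with this rule, so adjacent hits would still need to be merged.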

E_learner asked Jan 30 '15



1 Answer

I do not know the specific algorithm you are talking about, but maybe I can offer some general advice.

I guess the core of the problem is the distinction "what is a peak, what is just noise" in a noisy signal (since real-life input images are always noisy in some sense; I think the relevant input vector for peak detection in your code is xSignature). Once you have determined the peaks, calculating an average peak distance should be fairly straightforward.

As for peak detection, there are tons of papers describing quite sophisticated algorithms, but I'll outline some tried and true methods I'm using in my image processing job.

Smoothing

If you know the expected peak width w, you can as a first step apply some smoothing that gets rid of noise on a smaller scale by just summing over a window of about the expected peak width (from x-w/2 to x+w/2). You don't actually need to calculate the average value of the sliding window (divide by w), since for peak detection the absolute scale is irrelevant and the sum is proportional to the average value.
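A sketch of that smoothing step (the function name is mine, assuming a plain std::vector profile; the window is simply shrunk at the boundaries):

```cpp
#include <vector>

// Box smoothing: sum over a window of the expected peak width w around each
// sample. The result is proportional to the sliding-window average, which is
// all peak detection needs, since the absolute scale is irrelevant.
std::vector<float> boxSmooth(const std::vector<float>& x, int w)
{
    const int n = static_cast<int>(x.size());
    std::vector<float> out(n, 0.0f);
    for (int i = 0; i < n; ++i)
        for (int d = -w / 2; d <= w / 2; ++d)
        {
            int j = i + d;
            if (j >= 0 && j < n)   // shrink the window at the boundaries
                out[i] += x[j];
        }
    return out;
}
```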

Min-Max-Identification

You can run over your (potentially smoothed) profile vector and identify minimum and maximum indices (e.g. by simple slope sign change). Store these positions in a map<int (coordinate), bool (isMax)> or map<int (coordinate), double (value at coordinate)>. Or use a struct as value that holds all the relevant info (bool isMax, double value, bool isAtBoundary, ...)
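A minimal sketch of that identification step, assuming a plain std::vector profile and ignoring plateaus and boundary samples to keep it short:

```cpp
#include <vector>
#include <map>

// Mark interior extrema by a slope sign change.
// Returns a map from coordinate to isMax (true = maximum, false = minimum).
std::map<int, bool> findExtrema(const std::vector<float>& x)
{
    std::map<int, bool> extrema;
    for (int i = 1; i + 1 < static_cast<int>(x.size()); ++i)
    {
        if (x[i] > x[i - 1] && x[i] > x[i + 1]) extrema[i] = true;   // maximum
        if (x[i] < x[i - 1] && x[i] < x[i + 1]) extrema[i] = false;  // minimum
    }
    return extrema;
}
```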

Evaluate quality of detected peaks

For each maximum you found in the previous step, determine the height difference and maybe the slope to both the previous and the following minimum, resulting in a quality. This step depends on your problem domain. Maybe "peaks" need not be framed by a minimum value on both sides (in that case, your minimum detection above would have to be more sophisticated than slope change). Maybe there are min or max width restrictions on peaks. And so on.

Calculate a quality value based on the above questions for each maximum position. I often use something like Q_max = (average height difference from max to neighboring mins) / (max-min of profile). A peak candidate can then at most have a "quality" of 1, and at least 0.

Iterate over all your maxima, calculate their qualities and put them into a multimap or some other container, that can be sorted so that you can later iterate over your peaks in descending quality.
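A sketch of that quality heuristic (the function name is mine; the indices of the maximum and its two neighboring minima are assumed to come from the previous step):

```cpp
#include <vector>
#include <algorithm>

// Quality of a maximum at index m with neighboring minima at l and r:
// average height difference to the two minima, normalized by the overall
// max-min range of the profile. Result lies in [0, 1].
float peakQuality(const std::vector<float>& x, int l, int m, int r)
{
    float range = *std::max_element(x.begin(), x.end())
                - *std::min_element(x.begin(), x.end());
    if (range <= 0.0f) return 0.0f;   // flat profile: no meaningful peaks
    float avgDrop = 0.5f * ((x[m] - x[l]) + (x[m] - x[r]));
    return avgDrop / range;
}
```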

Distinguish peaks from non-peaks

Iterate over your peaks in descending quality. Possibly sort out all that do not fulfill minimum or maximum width/height/quality/distance to nearest peak with higher quality/... requirements for them to be peaks in your problem domain. Keep the rest. Done.

In your case, you would then reorder the peaks by coordinate and calculate the average distance between them.
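A sketch of that final step, assuming the surviving peak coordinates are sorted ascending; the average gap is the T(i, j) of the fingerprint paper, and the frequency follows as 1 / T(i, j):

```cpp
#include <vector>
#include <cstddef>

// Average number of pixels between consecutive peaks, inverted to a frequency.
float ridgeFrequency(const std::vector<int>& peaks)
{
    if (peaks.size() < 2) return 0.0f;   // cannot estimate from fewer than 2 peaks
    float total = 0.0f;
    for (std::size_t k = 1; k < peaks.size(); ++k)
        total += static_cast<float>(peaks[k] - peaks[k - 1]);
    float T = total / static_cast<float>(peaks.size() - 1);
    return 1.0f / T;
}
```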

I know this is vague, but there are no universally true answers for peak detection. Maybe in the paper you are working with there is a specific prescription hidden somewhere, but most authors omit such "mere technicalities" (and typically, if you contact them via email they can't remember or otherwise reproduce how they did it, which renders their results basically irreproducible).

Daniel answered Oct 29 '22