How to understand the average pixel number described in the frequency image?

I'm trying to implement the widely used fingerprint image enhancement algorithm proposed by Anil Jain et al. While implementing the ridge frequency image calculation in Section 2.5, I have difficulty understanding part of the description. The steps are described as follows:

1. Obtain the normalized image G.
2. Divide G into blocks of size w x w (16 x 16).
3. For each block centered at pixel (i, j), compute an oriented window of size l x w (32 x 16) that is defined in the ridge coordinate system.
4. For each block centered at pixel (i, j), compute the x-signature, X[0], X[1], ..., X[l-1], of the ridges and valleys within the oriented window, where

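$$X[k] = \frac{1}{w} \sum_{d=0}^{w-1} G(u, v), \quad k = 0, 1, \ldots, l-1,$$
$$u = i + \left(d - \frac{w}{2}\right) \cos O(i, j) + \left(k - \frac{l}{2}\right) \sin O(i, j),$$
$$v = j + \left(d - \frac{w}{2}\right) \sin O(i, j) + \left(\frac{l}{2} - k\right) \cos O(i, j).$$

Here O(i, j) is the local ridge orientation at the block center.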

If no minutiae and singular points appear in the oriented window, the x-signature forms a discrete sinusoidal-shape wave, which has the same frequency as that of the ridges and valleys in the oriented window. Therefore, the frequency of ridges and valleys can be estimated from the x-signature. Let T(i, j) be the average number of pixels between two consecutive peaks in the x-signature; then the frequency is computed as:

$$\Omega(i, j) = \frac{1}{T(i, j)}$$

My question is: I don't understand how to get the average number of pixels between two consecutive peaks, since the paper doesn't say how to identify peaks within the algorithm. How do I decide which pixels are peaks so that I can measure the distance between them? Could someone explain what I missed here?

In addition, I implemented the steps up to this point using OpenCV as shown below. I would really appreciate it if someone could go through my steps and double-check that I am implementing them correctly:

void Enhancement::frequency(cv::Mat inputImage, cv::Mat orientationMat)
{
    int blockSize = 16;
    int windowSize = 32;

    //compute x-signature
    for (int i = blockSize / 2; i < inputImage.rows - blockSize / 2; i += blockSize)
    {
        for (int j = blockSize / 2; j < inputImage.cols - blockSize / 2; j += blockSize)
        {
            int u = 0; 
            int v = 0;
            std::vector<float> xSignature;

            for (int k = 0; k < windowSize; k++)            
            {
                float sum = 0.0;

                for (int d = 0; d < blockSize; d++)
                {
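                    // Local ridge orientation (in radians) at the block center
                    float theta = orientationMat.at<float>(i, j);

                    // Map (d, k) in the oriented window to image coordinates (u, v)
                    u = cvRound(i + (d - 0.5 * blockSize) * cos(theta) + (k - 0.5 * windowSize) * sin(theta));
                    v = cvRound(j + (d - 0.5 * blockSize) * sin(theta) + (0.5 * windowSize - k) * cos(theta));

                    // Skip samples of the rotated window that fall outside the image
                    if (u < 0 || u >= inputImage.rows || v < 0 || v >= inputImage.cols)
                        continue;

                    sum += static_cast<float>(inputImage.at<uchar>(u, v));
                }

                // Average over the blockSize samples taken for this k
                xSignature.push_back(sum / blockSize);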
            }
        } // end of j-loop
    } // end of i-loop

}

Update

After searching some articles, I found that someone described how to determine whether a pixel is a peak like this:

Perform grayscale dilation on each block
Find where the dilation equals original values

But I still don't understand it clearly. Does that mean I can apply a block-wise morphological dilation to my grayscale image (I've already converted my image from RGB to grayscale in OpenCV before further processing)? Does "dilation equals original values" mean that the pixel intensity after morphological dilation equals its original value? I'm lost here.

asked Jan 30 '15 07:01

E_learner

1 Answer

I do not know the specific algorithm you are talking about, but maybe I can offer some general advice.

I guess the core of the problem is distinguishing what is a peak from what is just noise in a noisy signal (real-life input images are always noisy in some sense; I think the relevant input vector for peak detection in your code is xSignature). Once you have determined the peaks, calculating the average peak distance should be fairly straightforward.

As for peak detection, there are many papers describing quite sophisticated algorithms, but I'll outline some tried-and-true methods I use in my image processing job.

Smoothing

If you know the expected peak width w (not to be confused with the block size w in your paper), you can as a first step apply smoothing that removes noise on a smaller scale by summing over a window of about the expected peak width (from x-w/2 to x+w/2). You don't actually need to calculate the average value of the sliding window (divide by w), since for peak detection the absolute scale is irrelevant and the sum is proportional to the average value.
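
A minimal sketch of that smoothing step in plain C++, assuming the profile is a std::vector<float> such as xSignature. The name smoothProfile and the choice to clip the window at the ends of the profile are my own assumptions, not something the paper prescribes:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sliding-window sum over roughly one expected peak width.
// smoothProfile is a hypothetical helper name. The window runs from
// x - width/2 to x + width/2 and is clipped at both ends of the profile.
std::vector<float> smoothProfile(const std::vector<float>& profile, int width)
{
    assert(width > 0);
    const int n = static_cast<int>(profile.size());
    const int half = width / 2;
    std::vector<float> smoothed(profile.size(), 0.0f);

    for (int x = 0; x < n; ++x)
    {
        const int lo = std::max(0, x - half);
        const int hi = std::min(n - 1, x + half);

        float sum = 0.0f;
        for (int t = lo; t <= hi; ++t)
            sum += profile[t];

        // Deliberately not divided by the window size: only the relative
        // shape matters for peak detection.
        smoothed[x] = sum;
    }
    return smoothed;
}
```

Because the clipped windows at both ends sum over fewer samples, values within width/2 of the boundary are biased low and are probably best ignored when looking for peaks.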

Min-Max-Identification

You can run over your (potentially smoothed) profile vector and identify the indices of minima and maxima (e.g. by a simple slope sign change). Store these positions in a std::map<int, bool> (coordinate to isMax) or a std::map<int, double> (coordinate to value at that coordinate). Alternatively, use a struct as the mapped value that holds all the relevant information (bool isMax, double value, bool isAtBoundary, ...).
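
The slope-sign-change idea could be sketched as follows. The Extremum struct and findExtrema are hypothetical names; this version ignores the two boundary samples and, on a flat plateau, reports the last sample of the plateau:

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical record stored per extremum position.
struct Extremum
{
    bool isMax;   // true for a local maximum, false for a local minimum
    float value;  // profile value at that position
};

// Returns coordinate -> extremum for every interior slope sign change.
std::map<int, Extremum> findExtrema(const std::vector<float>& profile)
{
    std::map<int, Extremum> extrema;
    int prevSlope = 0;  // sign of the last non-zero slope seen so far

    for (std::size_t x = 1; x < profile.size(); ++x)
    {
        const float diff = profile[x] - profile[x - 1];
        const int slope = (diff > 0.0f) - (diff < 0.0f);
        if (slope == 0)
            continue;  // flat segment: keep the previous slope sign

        const int pos = static_cast<int>(x - 1);
        if (prevSlope > 0 && slope < 0)
            extrema[pos] = Extremum{true, profile[x - 1]};   // rising then falling
        else if (prevSlope < 0 && slope > 0)
            extrema[pos] = Extremum{false, profile[x - 1]};  // falling then rising

        prevSlope = slope;
    }
    return extrema;
}
```

Since std::map keeps its keys ordered, iterating over the result visits minima and maxima in coordinate order, which makes it easy to look up the neighbours of each maximum later.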

Evaluate quality of detected peaks

For each maximum you found in the previous step, determine the height difference, and maybe the slope, to both the previous and the following minimum, and derive a quality measure from them. This step depends on your problem domain. Maybe "peaks" need not be framed by a minimum value on both sides (in that case, your minimum detection above would have to be more sophisticated than slope change). Maybe there are minimum or maximum width restrictions on peaks. And so on.

Calculate a quality value based on the above questions for each maximum position. I often use something like Q_max = (average height difference from max to neighboring mins) / (max-min of profile). A peak candidate can then at most have a "quality" of 1, and at least 0.

Iterate over all your maxima, calculate their qualities, and put them into a multimap or some other sortable container so that you can later iterate over your peaks in descending quality.
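
One way to sketch the Q_max formula above, assuming the extrema are stored in coordinate order as coordinate-to-Extremum pairs. The names Extremum and rankMaxima are hypothetical, and using only the one available minimum for a maximum that has a minimum on just one side is my own simplification:

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical record stored per extremum position.
struct Extremum
{
    bool isMax;
    float value;
};

// Maps quality -> coordinate of each maximum, ordered by descending quality.
std::multimap<float, int, std::greater<float>>
rankMaxima(const std::map<int, Extremum>& extrema, const std::vector<float>& profile)
{
    std::multimap<float, int, std::greater<float>> ranked;
    if (profile.empty())
        return ranked;

    const auto mm = std::minmax_element(profile.begin(), profile.end());
    const float range = *mm.second - *mm.first;  // max - min of the profile
    if (range <= 0.0f)
        return ranked;  // constant profile: no peaks to rank

    for (auto it = extrema.begin(); it != extrema.end(); ++it)
    {
        if (!it->second.isMax)
            continue;

        float diffSum = 0.0f;
        int neighbours = 0;
        if (it != extrema.begin())
        {
            const auto prev = std::prev(it);
            if (!prev->second.isMax)
            {
                diffSum += it->second.value - prev->second.value;
                ++neighbours;
            }
        }
        const auto next = std::next(it);
        if (next != extrema.end() && !next->second.isMax)
        {
            diffSum += it->second.value - next->second.value;
            ++neighbours;
        }
        if (neighbours == 0)
            continue;  // not framed by any minimum: skipped in this sketch

        // Average height difference to the neighbouring minima,
        // normalised by the total range of the profile.
        ranked.emplace((diffSum / neighbours) / range, it->first);
    }
    return ranked;
}
```

Iterating over the returned multimap from begin() to end() then visits the peak candidates from best to worst.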

Distinguish peaks from non-peaks

Iterate over your peaks in descending quality. Discard any that do not meet the requirements for a peak in your problem domain, such as minimum or maximum width, height, quality, or distance to the nearest peak of higher quality. Keep the rest. Done.

In your case, you would then reorder the peaks by coordinate and calculate the average distance between them.
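
As a rough sketch of these last two steps, the function below filters candidates by quality and by distance to already accepted peaks, then returns the average spacing, which would play the role of T(i, j) in your paper (the frequency is then 1/T). The name averagePeakDistance and the thresholds minQuality and minDistance are hypothetical and would need tuning for fingerprint data:

```cpp
#include <algorithm>
#include <cstdlib>
#include <utility>
#include <vector>

// candidates: (quality, coordinate) pairs in any order.
// Returns the average distance between accepted peaks,
// or 0 if fewer than two peaks survive the filtering.
float averagePeakDistance(std::vector<std::pair<float, int>> candidates,
                          float minQuality, int minDistance)
{
    // Visit candidates in descending quality.
    std::sort(candidates.begin(), candidates.end(),
              [](const std::pair<float, int>& a, const std::pair<float, int>& b)
              { return a.first > b.first; });

    std::vector<int> kept;
    for (const auto& c : candidates)
    {
        if (c.first < minQuality)
            break;  // everything after this has even lower quality

        // Reject a candidate that sits too close to a better peak.
        bool tooClose = false;
        for (int p : kept)
        {
            if (std::abs(c.second - p) < minDistance)
            {
                tooClose = true;
                break;
            }
        }
        if (!tooClose)
            kept.push_back(c.second);
    }

    if (kept.size() < 2)
        return 0.0f;

    // Reorder by coordinate. The consecutive distances telescope, so their
    // sum is simply last - first.
    std::sort(kept.begin(), kept.end());
    return static_cast<float>(kept.back() - kept.front()) /
           static_cast<float>(kept.size() - 1);
}
```

A return value of 0 signals that no spacing could be measured, so the caller should check for it before computing 1/T.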

I know this is vague, but there are no universally true answers for peak detection. Maybe in the paper you are working with there is a specific prescription hidden somewhere, but most authors omit such "mere technicalities" (and typically, if you contact them via email they can't remember or otherwise reproduce how they did it, which renders their results basically irreproducible).

answered Oct 29 '22 17:10

Daniel

Tags:

c++

algorithm

image

image-processing

opencv