I am just being adventurous and taking my first baby step toward computer vision. I tried to implement the Hough transform on my own, but I just don't get the whole picture. I read the Wikipedia entry, and even the original "Use of the Hough Transformation to Detect Lines and Curves in Pictures" by Richard Duda and Peter Hart, but that didn't help.
Can someone help explain it to me in friendlier language?
The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure.
One way of reducing the computation required to perform the Hough transform is to make use of gradient information which is often available as output from an edge detector. In the case of the Hough circle detector, the edge gradient tells us in which direction a circle must lie from a given edge coordinate point.
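A minimal sketch of the gradient idea, assuming the circle radius is already known and that each edge point comes with a gradient direction `phi` (in practice this would come from the edge detector). The function name `vote_with_gradient` and the sample data are hypothetical. Each edge point votes for only two candidate centers along its gradient line instead of a whole ring of candidates.

```python
import math
from collections import Counter

def vote_with_gradient(edge_points, radius):
    """edge_points: iterable of (x, y, phi), where phi is the gradient direction in radians."""
    votes = Counter()
    for x, y, phi in edge_points:
        # The center lies on the line through (x, y) along the gradient, at
        # distance `radius`. Edge polarity is unknown here, so try both sides.
        for sign in (+1, -1):
            a = round(x + sign * radius * math.cos(phi))
            b = round(y + sign * radius * math.sin(phi))
            votes[(a, b)] += 1
    return votes

# Hypothetical data: a circle of radius 10 centered at (30, 40), sampled every
# 10 degrees, with the gradient assumed to point radially outward.
cx, cy, radius = 30, 40, 10
edge = [(cx + radius * math.cos(math.radians(d)),
         cy + radius * math.sin(math.radians(d)),
         math.radians(d))
        for d in range(0, 360, 10)]

votes = vote_with_gradient(edge, radius)
center, count = votes.most_common(1)[0]
print(center, count, sum(votes.values()))
```

With 36 edge points this casts 72 votes in total, whereas voting over a full ring of angles for each point would cast many more.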
Each edge point (x, y) maps to a sinusoidal curve ρ = x cos θ + y sin θ in the (ρ, θ) plane. If two edge points lie on the same line, their corresponding curves intersect at a specific (ρ, θ) pair. Thus, the Hough Transform algorithm detects lines by finding the (ρ, θ) pairs that have a number of intersections larger than a certain threshold.
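To make the intersection idea concrete, here is a small sketch that takes two points on the line x + y = 2 and searches a 1-degree grid for the θ at which their curves meet. The helper name `rho` and the sample points are illustrative assumptions, and the origin is taken to be (0, 0).

```python
import math

def rho(x, y, theta):
    """Value of x*cos(theta) + y*sin(theta): the curve traced by one point in (rho, theta) space."""
    return x * math.cos(theta) + y * math.sin(theta)

# Two points on the line x + y = 2 (hypothetical example data).
p1 = (0.0, 2.0)
p2 = (2.0, 0.0)

# Sample both curves on a 1-degree grid and find where they come closest.
thetas = [math.radians(d) for d in range(180)]
best_theta = min(thetas, key=lambda t: abs(rho(*p1, t) - rho(*p2, t)))
best_rho = rho(*p1, best_theta)

print(round(math.degrees(best_theta)), round(best_rho, 3))  # 45 1.414
```

The curves cross at θ = 45° and ρ = √2, which is exactly the perpendicular from the origin to the line x + y = 2.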
The circle Hough Transform (CHT) is a basic feature extraction technique used in digital image processing for detecting circles in imperfect images. The circle candidates are produced by “voting” in the Hough parameter space and then selecting local maxima in an accumulator matrix.
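A rough sketch of the voting step, assuming the radius is known in advance so that the parameter space is just the center (a, b). The function name, the 10-degree angular step, and the synthetic edge points are assumptions for illustration, not part of any particular library.

```python
import math
from collections import Counter

def hough_circle_centers(edge_points, radius, angle_step_deg=10):
    """Each edge point votes for every center that would place it on a circle of the given radius."""
    votes = Counter()
    for x, y in edge_points:
        for deg in range(0, 360, angle_step_deg):
            t = math.radians(deg)
            a = round(x - radius * math.cos(t))   # candidate center, x
            b = round(y - radius * math.sin(t))   # candidate center, y
            votes[(a, b)] += 1                    # accumulator cell (a, b)
    return votes

# Hypothetical edge points: a circle of radius 10 centered at (30, 40),
# sampled every 10 degrees.
cx, cy, radius = 30, 40, 10
edge = [(cx + radius * math.cos(math.radians(d)),
         cy + radius * math.sin(math.radians(d)))
        for d in range(0, 360, 10)]

votes = hough_circle_centers(edge, radius)
center, count = votes.most_common(1)[0]   # the strongest maximum in the accumulator
print(center, count)
```

The most-voted cell is the true center, because every edge point contributes one vote there while its other votes are scattered. When the radius is unknown, the accumulator gains a third dimension for r.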
Here's a very basic, visual explanation of how a Hough Transform works for detecting lines in an image:
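It's more common to think of a line in rectangular (Cartesian) coordinates, i.e. y = mx + b. As the Wikipedia article states, a line can also be expressed in polar form: r = x cos(theta) + y sin(theta), where r is the perpendicular distance from the origin to the line and theta is the direction of that perpendicular. The Hough transform exploits this change of representation (for lines, anyway; the discussion can also be applied to circles, ellipses, etc.).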
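As a quick illustration of the two representations, the sketch below converts a slope-intercept line to polar form and checks that a point on the line satisfies the polar equation. The function name `slope_intercept_to_polar` is made up for this example, and the origin is assumed to be (0, 0).

```python
import math

def slope_intercept_to_polar(m, b):
    """Convert y = m*x + b to (r, theta) such that x*cos(theta) + y*sin(theta) = r."""
    # Rewrite the line as -m*x + y = b; its normal vector is (-m, 1).
    theta = math.atan2(1.0, -m)      # direction of the normal, in (0, pi)
    r = b / math.hypot(m, 1.0)       # signed distance from the origin
    return r, theta

r, theta = slope_intercept_to_polar(-1.0, 2.0)   # the line x + y = 2

# Any point on the line satisfies the polar equation with the same (r, theta).
x, y = 0.5, 1.5
residual = x * math.cos(theta) + y * math.sin(theta) - r

# A vertical line such as x = 3 has no finite slope m, but in polar form
# it is simply r = 3, theta = 0.
vertical_r, vertical_theta = 3.0, 0.0
```

The vertical-line case is one practical reason to prefer the polar form: m would be infinite, while r and theta stay bounded.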
The first step in the Hough transform is to reduce the image to a set of edges. The Canny edge-detector is a frequent choice. The resulting edge image serves as the input to the Hough process.
To summarize, each pixel "lit" in the edge image is mapped into polar form: instead of its x and y, we consider the lines that could pass through it, each described by a direction theta and a distance r. (The center of the image is commonly used as the reference point for this change of coordinates.)
The Hough transform is essentially a histogram. Edge pixels mapping to the same theta and r are assumed to define a line in the image. To compute the frequency of occurrence, theta and r are discretized (partitioned into a number of bins). Once all edge pixels have been converted to polar form, the bins are analyzed to determine the lines in the original image.
It is common to look for the N most frequent parameters, or to threshold the parameters so that counts smaller than some n are ignored.
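The histogram view can be sketched in a few lines of plain Python. This is only an illustration under some assumptions: the edge pixels are given as a list of (x, y) pairs, the origin is the corner (0, 0) rather than the image center, theta is binned in 1-degree steps and r in unit steps, and the name `hough_lines` is hypothetical.

```python
import math
from collections import Counter

def hough_lines(edge_points, theta_step_deg=1, r_step=1.0):
    """Vote in a (theta bin, r bin) histogram; returns a Counter of votes per bin."""
    votes = Counter()
    for x, y in edge_points:
        # Each edge pixel votes once per theta bin, for the line through it
        # whose normal points in that direction.
        for deg in range(0, 180, theta_step_deg):
            t = math.radians(deg)
            r = x * math.cos(t) + y * math.sin(t)
            votes[(deg, round(r / r_step))] += 1
    return votes

# Hypothetical edge image: 50 pixels on the line y = x, plus 3 stray pixels.
points = [(i, i) for i in range(50)] + [(3, 15), (12, 1), (17, 6)]
votes = hough_lines(points)

top = votes.most_common(1)                             # the N most-voted bins, here N = 1
strong = {k: v for k, v in votes.items() if v >= 40}   # threshold with n = 40
print(top)
```

The winning bin is theta = 135 degrees, r = 0, which is the line y = x; the stray pixels never gather enough votes in any single bin to pass the threshold.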
I'm not sure this answer is any better than the sources you originally presented. Is there a particular point that you are stuck on?