
Bad Results with Basic Stereo Block Matching (without OpenCV)

I am trying to implement stereo block matching without using OpenCV or other image processing libraries. The tutorials, books, and lecture slides I found only teach the most basic approach of comparing blocks between the two images, and the results are very bad. I have read some papers, such as the one by K. Konolige that is the basis of the algorithm in OpenCV, but I still seem to be missing something important.

What I am doing now:

  1. Apply Sobel to left and right image.
  2. Do block matching
    • Pick a (9x9) block around a pixel in the left image and compare with blocks in the same row of the right image (up to a maximum of 80 pixels right of the original block)
    • Find the one with the best match (using SAD, the sum of absolute differences)

The resulting disparity is how many steps I had to go right to find the best match.
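For reference, here is a minimal pure-Python sketch of the search described above, not the asker's actual code. The names `sad` and `block_match` are made up for illustration, `half=4` would give the 9x9 block and `max_disp=80` the search range from the question, and the sketch assumes the common rectified convention in which left pixel x matches right pixel x - d (flip the sign if your pair is set up the other way). It works on whatever 2D list you pass in, raw intensities or Sobel output.

```python
def sad(left, right, y, xl, xr, half):
    """Sum of absolute differences between two (2*half+1)^2 blocks."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][xl + dx] - right[y + dy][xr + dx])
    return total


def block_match(left, right, half=1, max_disp=4):
    """Return a disparity map; None where the block does not fit."""
    h, w = len(left), len(left[0])
    disp = [[None] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            best_cost, best_d = None, None
            for d in range(max_disp + 1):
                xr = x - d  # assumed convention: match lies at x - d
                if xr - half < 0:
                    break  # candidate block would leave the image
                cost = sad(left, right, y, x, xr, half)
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp


# Tiny synthetic pair: the left image is the right image shifted by 2 pixels.
right = [[x * x for x in range(12)] for _ in range(5)]
left = [[(x - 2) ** 2 for x in range(12)] for _ in range(5)]
disp = block_match(left, right, half=1, max_disp=4)
```

On this synthetic pair every pixel whose true match lies inside the image gets disparity 2. Pixels near the border stay `None` or get a wrong value because the true match is out of range.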

After reading the Konolige paper I implemented the left-right check: once you have found the best match in the right image, you search for that block's best match back in the left image, and you only accept the result if it lands on the pixel you originally started from or right next to it.
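A rough sketch of such a left-right check, assuming you already have one disparity map per image (`disp_left` and `disp_right` are hypothetical names, with `None` marking pixels that have no estimate) and the convention that left pixel x maps to right pixel x - d:

```python
def left_right_check(disp_left, disp_right, tol=1):
    """Keep only left disparities confirmed by the right-to-left search."""
    h, w = len(disp_left), len(disp_left[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = disp_left[y][x]
            if d is None:
                continue
            xr = x - d  # where the left pixel landed in the right image
            if xr < 0 or xr >= w:
                continue
            d_back = disp_right[y][xr]
            # Searching back from xr lands on xr + d_back in the left image;
            # accept if that is x itself or at most tol pixels away from it.
            if d_back is not None and abs(d - d_back) <= tol:
                out[y][x] = d
    return out


checked = left_right_check([[2, 2, 2, 3, 2]], [[2, 2, 0, 2, 2]])
# checked == [[None, None, 2, 3, None]]
```

Every rejected pixel becomes a hole, so a check like this can only make the map sparser, never denser.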

I also added a check so that a pixel can only be matched once: a bitfield records which pixels have already been matched, and those pixels are skipped in later searches.

The result doesn't look very wrong, but it is very sparse.

What is it that I fail to add? Something everybody seems to know but isn't spelled out. Do I need to add some kind of interpolation?

Any help is appreciated!

My input is the Tsukuba Stereo Pair.

Results found on the web (the 2nd is OpenCV BM, the 3rd is apparently SAD BM by the blog's authors):

http://cseautonomouscar2012.files.wordpress.com/2012/11/111412_2001_comparisono1.png

bmsob asked Oct 21 '22 19:10



1 Answer

It is normal that your results are sparse, because your algorithm is sparse!

Let's rewind the story a bit:

  • in your first step, you apply a Sobel edge detector. This extracts a sparse set of features, namely the edges of the image;
  • then you apply block matching to the result, so what you are actually matching is edges, i.e. sparse features.

Classical BM implementations work on image intensity patches (this is why brightness equalization is important), i.e., they compute the SSD, SAD or correlation of pixel intensities.
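To see why brightness matters for intensity-based costs, here is a small sketch that compares plain SAD with a zero-mean normalized cross-correlation on flattened patches. The function names and the sample values are made up for the illustration, and ZNCC is just one common way to normalize the correlation.

```python
import math


def sad_cost(a, b):
    """Sum of absolute differences; lower is better."""
    return sum(abs(p - q) for p, q in zip(a, b))


def zncc(a, b):
    """Zero-mean normalized cross-correlation in [-1, 1]; higher is better."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                    * sum((q - mean_b) ** 2 for q in b))
    # A perfectly flat patch has zero variance and carries no information.
    return num / den if den > 0 else 0.0


patch = [10, 50, 30, 80, 20, 60, 40, 90, 70]   # a flattened 3x3 block
brighter = [p + 25 for p in patch]             # same block, brighter camera

sad_value = sad_cost(patch, brighter)    # large, although the content is identical
zncc_value = zncc(patch, brighter)       # about 1.0, the offset is cancelled out
```

SAD penalizes the constant offset on every pixel, while the normalized score ignores it. That is the reason to either equalize brightness before using SAD or switch to a normalized cost.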

Also, BM works, but not so well on difficult images. Robust cost functions (such as normalized correlation) are often necessary instead of SAD. And be careful when you compare your results with OpenCV: OpenCV offers another BM implementation called SGBM (for Semi-Global BM). In this case, an additional term enforces that the disparities of neighbouring pixels are close to each other. This is called a regularity constraint and helps in two ways:

  1. it limits the noise in the output (if the disparity of a pixel is an outlier, it is removed and replaced by a value inferred from its neighbours);
  2. it propagates good results into areas where the algorithm has no clues to infer a good result. This is typically the case with edge matching: you get good disparity estimates on the set of edges, and you let the regularization term propagate these estimates onto flat (texture-less and edge-less) areas.
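The effect of such a regularity term can be illustrated with a toy dynamic program over a single scanline. This is a strong simplification for illustration only, not OpenCV's SGBM: `scanline_dp` is a hypothetical helper that picks one disparity per pixel by minimizing the matching cost plus `penalty * |d - d_prev|` between neighbours.

```python
def scanline_dp(cost, penalty):
    """cost[x][d] is the matching cost of disparity d at pixel x.
    Returns the per-pixel disparities minimizing data cost plus
    penalty * |d - d_prev| summed along the row."""
    n, nd = len(cost), len(cost[0])
    total = [list(cost[0])]   # best accumulated cost ending in (x, d)
    back = []                 # back-pointers to the previous disparity
    for x in range(1, n):
        row, brow = [], []
        for d in range(nd):
            prev = min(range(nd),
                       key=lambda p: total[-1][p] + penalty * abs(d - p))
            row.append(cost[x][d] + total[-1][prev] + penalty * abs(d - prev))
            brow.append(prev)
        total.append(row)
        back.append(brow)
    d = min(range(nd), key=lambda k: total[-1][k])
    path = [d]
    for brow in reversed(back):
        d = brow[d]
        path.append(d)
    path.reverse()
    return path


# Pixels 0, 1, 3, 4 are textured and clearly prefer d = 1;
# pixel 2 is nearly flat and its raw minimum (d = 0) is unreliable.
cost = [[9, 0, 9], [9, 0, 9], [3, 5, 5], [9, 0, 9], [9, 0, 9]]
no_reg = scanline_dp(cost, penalty=0)    # [1, 1, 0, 1, 1]
with_reg = scanline_dp(cost, penalty=2)  # [1, 1, 1, 1, 1]
```

With `penalty=0` this is plain winner-takes-all and the ambiguous pixel keeps its outlier value. With a small penalty the reliable neighbours pull it to their disparity, which is both the noise-limiting and the propagation effect listed above.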
sansuiso answered Oct 29 '22 22:10
