Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How a marker-based augmented reality algorithm (like ARToolkit's one) works?

For my job i've been using a Java version of ARToolkit (NyARTookit). So far it proven good enough for our needs, but my boss is starting to want the framework ported in other platforms such as web (Flash, etc) and mobiles. While i suppose i could use other ports, i'm increasingly annoyed by not knowing how the kit works and beyond that, from some limitations. Later i'll also need to extend the kit's abilities to add stuff like interaction (virtual buttons on cards, etc), which as far as i've seen in NyARToolkit aren't supported.

So basically, i need to replace ARToolkit with a custom mark detector (and in case of NyARToolkit, try to get rid of JMF and use a better solution via JNI). However i don't know how these detectors work. I know about 3D graphics and i've built a nice framework around it, but i need to know how to build the underlying tech :-).

Does anyone know any sources about how to implement a marker-based augmented reality application from scratch? When searching in google i only find "applications" of AR, not the underlying algorithms :-/.

like image 246
Bad Sector Avatar asked Feb 10 '10 11:02

Bad Sector


1 Answers

'From scratch' is a relative term. Truly doing it from scratch, without using any pre-existing vision code, would be very painful and you wouldn't do a better job of it than the entire computer vision community.

However, if you want to do AR with existing vision code, this is more reasonable. The essential sub-tasks are:

  1. Find the markers in your image or video.
  2. Make sure they are the ones you want.
  3. Figure out how they are oriented relative to the camera.

The first task is keypoint localization. Techniques for this include SIFT keypoint detection, the Harris corner detector, and others. Some of these have open source implementations - i think OpenCV has the Harris corner detector in the function GoodFeaturesToTrack.

The second task is making region descriptors. Techniques for this include SIFT descriptors, HOG descriptors, and many many others. There should be an open-source implementation of one of these somewhere.

The third task is also done by keypoint localizers. Ideally you want an affine transformation, since this will tell you how the marker is sitting in 3-space. The Harris affine detector should work for this. For more details go here: http://en.wikipedia.org/wiki/Harris_affine_region_detector

like image 162
forefinger Avatar answered Sep 19 '22 14:09

forefinger