So imagine that there is a camera looking at your computer screen. What I am trying to do is determine how much that camera is rotated, how far away it is from the screen, and where it is relative to the screen's center. In short, I want the rotation and translation matrices.
I'm using OpenCV for this, and I followed their camera calibration example to do this task with a checkerboard pattern and a frame from a webcam. Now I would like to do the same with generic images, namely a screen capture and a frame from a webcam.
I've tried using the feature detection algorithms to get a list of keypoints from both images and then matching those keypoints with a BFMatcher, but I have run into problems. Specifically, SIFT does not match the keypoints correctly, and SURF does not find keypoints reliably on a scaled image.
Is there an easier solution to this problem? I feel this must be a common thing that people have done, but I have not found much discussion of it online.
Thanks!!
Finding natural planar markers is a common task in computer vision, but in your case the "marker" is a screen whose content varies with what is being displayed: your desktop, a browser, a movie, and so on.
So you cannot apply the usual marker-detection methods; you should try shape recognition instead. One idea is to run a particle filter on a rectangular template with the same dimensions (across different scales) as your screen frame, applying edge detection first.
The particle filter will fit the template to the screen's region in the frame, which gives you the position. For the orientation you will need to compute a homography, and you need 4 point correspondences in the "marker" for this, so you can apply the Direct Linear Transform (cv::findHomography() does this for you). Your four points can be the four corners of the screen. This is just an idea, good luck!