
Algorithm to determine size of a room from video feed

Does anybody know of an image analysis algorithm with which I can determine how large a room is (approximately, in real-life measurements, let's say width in meters or something) out of one (or multiple) video recordings of this room?

I'm currently using OpenCV as my image library of choice, but I haven't gotten very far in terms of learning image analysis algorithms, so just a name drop would be fine.

Thanks

Edit: Okay, a little bit of clarification I just got from the people involved. I basically have no control over how the video feed is taken, and can't guarantee that there are multiple data sources. However, I do have the location of a certain point in the room, and I'm supposed to place something in relation to that point. So I would probably be looking at trying to identify the edges of the room, then working out what percentage of the way into the room the given point lies, and then guessing how large the room is (a rough sketch of the edge-detection part is below).
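Purely as an illustration of the edge-detection idea, this is roughly how you might pull candidate wall/floor boundary lines out of a single frame with OpenCV's Canny detector and probabilistic Hough transform. The video path, thresholds, and Hough parameters are all made-up placeholder values, not anything prescribed by the problem:

```python
import cv2
import numpy as np

# Grab a single frame from the video feed (the path is a placeholder).
cap = cv2.VideoCapture("room.mp4")
ok, frame = cap.read()
cap.release()

if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)

    # Detect edges, then fit straight line segments to them.
    # Long segments are candidates for wall-floor and wall-wall boundaries.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=100, maxLineGap=10)

    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)),
                     (0, 255, 0), 2)
    cv2.imwrite("room_edges.png", frame)
```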

asked Oct 10 '22 by fk2

1 Answer

Awfully difficult (yet interesting!) problem.

If you are thinking of doing this in a completely automated way, I think you'll have a lot of issues. But I think this is doable if an operator can mark control points in a set of pictures.

Your problem can be stated more generally as finding the distance between two points in 3D space, when you only have the locations of these points in two or more 2D pictures taken from different points of view. The process will work more or less like this:

  • The pictures will come with camera location and orientation information. For example, let's say that you get two pictures taken with the same camera orientation, with the camera three feet apart horizontally. You will have to define a reference origin for the 3D space in which the cameras are located; for example, you can say that the left picture has the camera at (0,0,0) and the right picture at (3,0,0), and both will be facing forward, which would be an orientation of (0,0,1). Or something like that.
  • Now the operator comes and marks the two corners of the room in both pictures. So you have 2 sets of 2D coordinates for each corner.
  • You must know the details of your camera and lens (field of view, lens distortion, aberrations, etc.). The more you know about how your camera deforms the image, the more accurate you can make your estimate. This is the same stuff panorama stitching software does to achieve a better stitch. See PanoTools for info on this.
  • Here comes the fun part: you will now do the inverse of a perspective projection for each of your 2D points. The perspective projection takes a point in 3D space and a camera definition and computes a 2D point. This is used to represent three-dimensional objects on a flat surface, like a computer screen. You are doing the reverse of that: for each 2D point you will try to obtain a 3D coordinate. Since there isn't enough information in a 2D point to determine depth, the best you can do from a single 2D point is obtain a line in 3D space that passes through the lens and through the point in question, but you don't know how far from the lens the point is. But you have the same 2D point in two images, so you can compute two 3D lines from different camera locations. These lines will not be parallel, and because of measurement noise they won't intersect exactly either, so take the point that is closest to both lines. That point will be a good estimate of the location of the 3D point in space, in the coordinates of your reference camera 3D space.
  • The rest is easy. When you have the estimated 3D locations of the two points of interest, you just compute the 3D distance between them, and that's the number that you want. (A code sketch of the whole pipeline follows this list.)
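To make that concrete, here is a minimal sketch of the back-projection and ray-intersection steps with OpenCV and NumPy. Everything numeric in it is an assumption for illustration: the camera matrix, the zero distortion coefficients, the two camera poses (left camera at the origin, right camera about three feet to the right, both facing forward, as in the example above), and the operator-marked pixel coordinates of the two corners.

```python
import cv2
import numpy as np

# --- Assumed camera calibration (placeholder values) ------------------------
K = np.array([[800.0,   0.0, 640.0],      # focal lengths and principal point
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])
dist = np.zeros(5)                        # pretend the lens has no distortion

# --- Camera poses in a common reference frame -------------------------------
# Left camera at the origin, right camera ~3 ft (0.91 m) to the right,
# both facing straight ahead (identity rotation).
cam_centers = [np.array([0.0, 0.0, 0.0]),
               np.array([0.91, 0.0, 0.0])]
rotations = [np.eye(3), np.eye(3)]

def pixel_to_ray(pt_px, K, dist, R, center):
    """Back-project one 2D pixel into a 3D ray (origin, unit direction)."""
    # Undo the lens model: pixel -> normalized camera coordinates.
    norm = cv2.undistortPoints(np.array([[pt_px]], dtype=np.float64), K, dist)
    x, y = norm[0, 0]
    d_cam = np.array([x, y, 1.0])          # direction in camera coordinates
    d_world = R @ d_cam                    # rotate into the world frame
    return center, d_world / np.linalg.norm(d_world)

def intersect_rays(o1, d1, o2, d2):
    """Closest point between two (possibly skew) 3D lines: the midpoint of
    their common perpendicular. Solves for the parameters t1, t2 along
    each ray that minimize the distance between the two lines."""
    a = np.array([[d1 @ d1, -d1 @ d2],
                  [d1 @ d2, -d2 @ d2]])
    b = np.array([(o2 - o1) @ d1, (o2 - o1) @ d2])
    t1, t2 = np.linalg.solve(a, b)
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

def locate_point(px_left, px_right):
    """Estimate the 3D location of a point marked in both images."""
    o1, d1 = pixel_to_ray(px_left,  K, dist, rotations[0], cam_centers[0])
    o2, d2 = pixel_to_ray(px_right, K, dist, rotations[1], cam_centers[1])
    return intersect_rays(o1, d1, o2, d2)

# Operator-marked pixel coordinates of the two room corners (placeholders).
corner_a = locate_point((102.0, 415.0), (180.0, 417.0))
corner_b = locate_point((1190.0, 402.0), (1248.0, 405.0))

# Distance comes out in the same units as the camera baseline (metres here).
room_width = np.linalg.norm(corner_a - corner_b)
print(f"Estimated distance between corners: {room_width:.2f} m")
```

If you already have calibrated projection matrices for the two views, OpenCV's cv2.triangulatePoints does essentially the same job; the hand-rolled version above just makes the geometry of the two rays explicit.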

Pretty easy, huh?

answered Oct 12 '22 by Miguel