How to get the real-life size of an object from an image when the distance between the object and the camera is unknown?

Tags:

computer-vision

I have to make a mobile app that calculates the real-life size of an object in an image.

I have done some research and found a helpful question, "How would you find the height of objects given an image?", which says:

The relation between the camera distance and the real-life size of the object isn't actually that complex: the ratio of the size of the object on the sensor to the size of the object in real life is the same as the ratio of the focal length to the distance to the object.

distance to object (mm) = focal length (mm) * real height of the object (mm) * image height (pixels)
                          ---------------------------------------------------------------------------
                          object height (pixels) * sensor height (mm)
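
For illustration, here is a minimal Python sketch of the relation above and of the same relation solved for the real height. The function names and the sample numbers are hypothetical, and the code assumes an ideal pinhole camera with no lens distortion. It makes the problem visible: the second function cannot be evaluated without a distance.

```python
def distance_to_object_mm(focal_length_mm, real_height_mm, image_height_px,
                          object_height_px, sensor_height_mm):
    """Distance to the object, using the relation quoted above."""
    return (focal_length_mm * real_height_mm * image_height_px) / (
        object_height_px * sensor_height_mm)


def real_height_from_distance_mm(focal_length_mm, distance_mm, image_height_px,
                                 object_height_px, sensor_height_mm):
    """The same relation rearranged for the real height (needs the distance)."""
    return (distance_mm * object_height_px * sensor_height_mm) / (
        focal_length_mm * image_height_px)


if __name__ == "__main__":
    # Made-up numbers: 4 mm lens, 3 mm sensor height, 3000 px image height,
    # and a 1000 mm tall object that covers 600 px in the image.
    d = distance_to_object_mm(4.0, 1000.0, 3000, 600, 3.0)
    print(d)
    print(real_height_from_distance_mm(4.0, d, 3000, 600, 3.0))
```

With these made-up numbers the distance comes out at roughly 6.7 m, and feeding that distance back in recovers the 1000 mm height.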

But how do I get the real height of the object if the distance is not known?

Do the tools that create 3D models from images recover real-life dimensions?

465

asked Mar 30 '12 09:03

Parth mehta

1 Answer

The simple answer is you can't.

Incidentally, this is why humans have two eyes. If you want to judge size without a known distance, you'll need at least two reference points, that is, two views of the object from different positions. This allows you to triangulate the position of the object, get a distance to it, and use your known focal length to calculate the size.
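
As a rough sketch of that triangulation, assume two identical, parallel (rectified) cameras a known baseline apart, with the focal length expressed in pixels (focal length in mm * image height in pixels / sensor height in mm). The function names and numbers are hypothetical, and real stereo pairs need calibration and rectification first.

```python
def depth_from_disparity_mm(focal_length_px, baseline_mm, disparity_px):
    """Depth of a point seen by two parallel cameras: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length_px * baseline_mm / disparity_px


def size_from_depth_mm(object_height_px, depth_mm, focal_length_px):
    """Real height of an object that spans object_height_px at the given depth."""
    return object_height_px * depth_mm / focal_length_px


if __name__ == "__main__":
    # Made-up numbers: 4000 px focal length, 65 mm baseline (about eye spacing),
    # and the object shifts by 26 px between the two views.
    depth = depth_from_disparity_mm(4000.0, 65.0, 26.0)
    print(depth)
    print(size_from_depth_mm(400.0, depth, 4000.0))
```

With these numbers the object is about 10 m away, and a 400 px tall object at that depth is about 1 m tall.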

The more complex answer is that there are ways around this, for example:

Cheat by using a known reference:

If you have an object of known size in the image, you can infer the distance. This is similar to what NASA does to calibrate its cameras.

You can make safe assumptions if you're dealing with common objects, such as the height of one storey when analysing the image of a building.
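
A minimal sketch of the known-reference idea, assuming the reference and the object are at about the same distance from the camera and roughly facing it, so one millimetres-per-pixel scale applies to both. The function name and the pixel measurements are hypothetical; the 85.6 mm width is that of a standard ID-1 bank card.

```python
def size_from_reference_mm(object_px, reference_px, reference_mm):
    """Scale the object's pixel size by the reference's mm-per-pixel ratio.

    Only valid when both lie at (about) the same distance from the camera.
    """
    if reference_px <= 0:
        raise ValueError("reference must span a positive number of pixels")
    mm_per_px = reference_mm / reference_px
    return object_px * mm_per_px


if __name__ == "__main__":
    # Made-up measurements: a bank card (85.6 mm wide) spans 428 px,
    # and the object next to it spans 600 px.
    print(size_from_reference_mm(600.0, 428.0, 85.6))
```

Here the scale is about 0.2 mm per pixel, so the object is about 120 mm wide. If the reference sits at a different depth from the object, the estimate is off by the ratio of the two distances.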

Move your camera around:

This allows you to get more than one reference point with the same camera.

I suppose you could use the accelerometer to measure how far the phone moved between the image captured at time T1 and the image captured at time T2. This would give you two images of the same subject with a known distance between the two camera positions. This then allows you to triangulate as if you had two eyes.

Whether normal hand-held camera jitters will be sufficient for triangulation, or whether the accelerometer will be accurate enough to inertially position the phone, I don't know.
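
To get a feel for the accuracy question, here is a rough sketch that integrates accelerometer samples twice to estimate how far the phone moved. It is a simplification under stated assumptions: motion along one axis, gravity already removed, zero initial velocity, and a fixed sample interval. The function name and sample values are hypothetical.

```python
def displacement_from_acceleration(samples_m_s2, dt_s):
    """Integrate acceleration twice (trapezoidal rule) to get displacement in metres."""
    if len(samples_m_s2) < 2:
        return 0.0
    velocity = 0.0
    position = 0.0
    previous = samples_m_s2[0]
    for current in samples_m_s2[1:]:
        new_velocity = velocity + 0.5 * (previous + current) * dt_s
        position += 0.5 * (velocity + new_velocity) * dt_s
        velocity = new_velocity
        previous = current
    return position


if __name__ == "__main__":
    dt = 0.01  # 100 Hz sampling (assumed)
    # True motion: constant 1 m/s^2 for one second.
    print(displacement_from_acceleration([1.0] * 101, dt))
    # A stationary phone whose sensor reads a constant 0.05 m/s^2 bias for two seconds.
    print(displacement_from_acceleration([0.05] * 201, dt))
```

The second call shows the difficulty: a small constant bias turns into about 10 cm of phantom movement after two seconds, and the error grows with the square of the elapsed time. That is comparable to the baseline you would be trying to measure.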

Assume a distance:

If your app is designed to compare something on the scale of a human hand (or other bit of human anatomy), you can probably safely assume a distance based on what people will naturally do. The focus limits of the camera itself will also give a lower and upper bound on how far away an object can be and still be in focus. This will probably be within a tolerable margin of error.
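
As a sketch of what an assumed distance range gives you, the snippet below turns a lower and upper distance bound into a lower and upper bound on the object's height, using the same pinhole relation as in the question. The function name, the camera parameters, and the 300 to 500 mm range are all hypothetical.

```python
def height_bounds_mm(object_height_px, image_height_px, sensor_height_mm,
                     focal_length_mm, min_distance_mm, max_distance_mm):
    """Return (smallest, largest) real height consistent with the distance range."""
    if min_distance_mm > max_distance_mm:
        raise ValueError("min_distance_mm must not exceed max_distance_mm")
    # Real height per millimetre of distance for this pixel measurement.
    height_per_distance = (object_height_px * sensor_height_mm) / (
        focal_length_mm * image_height_px)
    return (min_distance_mm * height_per_distance,
            max_distance_mm * height_per_distance)


if __name__ == "__main__":
    # Made-up numbers: object covers 600 of 3000 px, 3 mm sensor height,
    # 4 mm lens, and the user is assumed to hold the phone 300 to 500 mm away.
    print(height_bounds_mm(600, 3000, 3.0, 4.0, 300.0, 500.0))
```

With these numbers the object is somewhere between about 45 mm and 75 mm tall. The relative error in size is the same as the relative error in the assumed distance.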

As you mention in your question, there is an entire subfield dedicated to this problem, and it is an active research area.

179

answered Oct 16 '22 12:10

brice
