Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to detect object on images?

I need python solution.

I have 40-60 images (Happy Holiday set). I need to detect object on all these images.

I don't know object size, form, location on image, I don't have any object template. I know only one thing: this object is present in almost all images. I called it UFO.

Example: enter image description hereenter image description hereenter image description hereenter image description here

As seen in example, from image to image everything changes except UFO. After detection I need to get:

X coordinate of the top left corner

Y coordinate of the top left corner

width of blue object region (i marked region on example as red rectangle)

height of blue object region

like image 847
Alex Avatar asked Feb 08 '13 07:02

Alex


People also ask

Can AI identify objects in images?

Image recognition only outputs a class label for an identified object, and image segmentation creates a pixel-level understanding of a scene's elements. What separates object detection from these other tasks is its unique ability to locate objects within an image or video.

What is object recognition in image processing?

Object recognition is a computer vision technique for identifying objects in images or videos. Object recognition is a key output of deep learning and machine learning algorithms. When humans look at a photograph or watch a video, we can readily spot people, objects, scenes, and visual details.

What is object detection in computer vision?

Object detection is a computer vision problem. While closely related to image classification, object detection performs image classification at a more granular scale. Object detection both locates and categorizes entities within images. Object detection models are commonly trained using deep learning and neural networks.

What is the difference between image classification and object detection?

While image classifications (e.g: CNN stacks) worked really well to classify images, it still has limitations to detect multiple objects of interest within an image and understanding how they are placed within the image. Object detections took one step further from image classifications, drawing one or many bounding boxes of objects of interest.

What is object detection in OpenCV?

Note: For more information, refer to Introduction to OpenCV. Object Detection is a computer technology related to computer vision, image processing, and deep learning that deals with detecting instances of objects in images and videos. We will do object detection in this article using something known as haar cascades.

How are object detection models trained?

Object detection models are commonly trained using deep learning and neural networks. See Deep learning vs machine learning for more information. Use object detection when images contain multiple objects of different types.


1 Answers

I create a second answer instead of extending my first answer even more. I use the same approach, but on your new examples. The only difference is: I use a set of fixed thresholds instead of determining it automatically. If you can play around with it, this should suffice.

import numpy as np
import PIL
import matplotlib.pyplot as plt
import glob

filenames = glob.glob("14767594/*.jpg")
images = [np.asarray(PIL.Image.open(fn)) for fn in filenames]
sample_images = np.concatenate([image.reshape(1,image.shape[0], image.shape[1],image.shape[2]) 
                            for image in images], axis=0)
                                                        
plt.figure(1)
for i in range(sample_images.shape[0]):
    plt.subplot(2,2,i+1)
    plt.imshow(sample_images[i,...])
    plt.axis("off")
plt.subplots_adjust(0,0,1,1,0,0)

# determine per-pixel variablility, std() over all images
variability = sample_images.std(axis=0).sum(axis=2)

# show image of these variabilities
plt.figure(2)
plt.imshow(variability, cmap=plt.cm.gray, interpolation="nearest", origin="lower")

# determine bounding box
thresholds = [5,10,20]
colors = ["r","b","g"]
for threshold, color in zip(thresholds, colors): #variability.mean()
    non_empty_columns = np.where(variability.min(axis=0)<threshold)[0]
    non_empty_rows = np.where(variability.min(axis=1)<threshold)[0]
    boundingBox = (min(non_empty_rows), max(non_empty_rows), min(non_empty_columns), max(non_empty_columns))
    
    # plot and print boundingBox
    bb = boundingBox
    plt.plot([bb[2], bb[3], bb[3], bb[2], bb[2]],
             [bb[0], bb[0],bb[1], bb[1], bb[0]],
             "%s-"%color, 
             label="threshold %s" % threshold)
    print boundingBox

plt.xlim(0,variability.shape[1])
plt.ylim(variability.shape[0],0)
plt.legend()

plt.show()

Produced plots:

Input imagesOutputs

Your requirements are closely related to ERP in cognitive neuroscience. The more input images you have, the better this approach will work as the signal-to-noise ratio increases.

like image 82
Thorsten Kranz Avatar answered Oct 02 '22 00:10

Thorsten Kranz