
Object detection + segmentation

I'm trying to find an efficient way, of acceptable complexity, to

  • detect an object in an image so I can isolate it from its surroundings
  • segment that object into its sub-parts and label them so I can then fetch them at will

It's been 3 weeks since I entered the image processing world, and I've read about so many algorithms (SIFT, snakes, more snakes, Fourier-related, etc.) and heuristics that I don't know where to start or which one is "best" for what I'm trying to achieve. Bearing in mind that the image dataset of interest is a pretty large one, I don't even know whether I should use some algorithm implemented in OpenCV or implement one on my own.

To summarize:

  • Which methodology should I focus on? Why?
  • Should I use OpenCV for that kind of stuff or is there some other 'better' alternative?

Thank you in advance.

EDIT -- More info regarding the datasets

Each dataset consists of 80K images of products sharing the same

  • concept e.g. t-shirts, watches, shoes
  • size
  • orientation (90% of them)
  • background (95% of them)

All pictures in each dataset look almost identical, apart from the product itself. To make things a little clearer, let's consider only the 'watch dataset':

All the pictures in the set look almost exactly like this:

[image: sample picture from the watch dataset]

(again, apart from the watch itself). I want to extract the strap and the dial. The thing is that there are lots of different watch styles and therefore shapes. From what I've read so far, I think I need a template algorithm that allows bending and stretching, so that it can match straps and dials of different styles.

Instead of creating three distinct templates (upper part of strap, lower part of strap, dial), it would be reasonable to create only one and segment it into 3 parts. That way, I would be confident that the parts are detected in the intended positions relative to each other, e.g. the dial would not be detected below the lower part of the strap.

Of all the algorithms/methodologies I've encountered, active shape/appearance models seem to be the most promising. Unfortunately, I haven't managed to find a decent implementation, and I'm not confident enough that this is the best approach to go ahead and write one myself.

If anyone could point out what I should really be looking for (algorithm/heuristic/library/etc.), I would be more than grateful. If you think my description is still a bit vague, feel free to ask for a more detailed one.

sawidis asked Aug 28 '11 13:08



1 Answer

From what you've said, here are a few things that pop up at first glance:

  • The simplest thing to do is to binarize the image and run Connected Components using OpenCV or the CvBlob library. For simple images with a non-complex background, this usually yields the objects.

  • However, looking at your sample image, texture-based segmentation techniques may work better: the watch dial, the straps and the background vary widely in texture/roughness, and this could be an ideal way to separate them.

    The roughness of a region can be found with the Eigen transform (explained a bit on SO; check the link to the research paper provided there), and the Mean Shift filter can then be applied to the output of the Eigen transform. This should give regions clearly separated according to texture. Both the pyramidal Mean Shift and finding eigenvalues by SVD are implemented in OpenCV, so unless you can optimize your own code, it's better (and easier) to use the built-in functions (where present) as far as speed and efficiency are concerned.
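
To make the first suggestion (binarize, then Connected Components) concrete, here is a minimal pure-Python sketch on a toy image. It is only an illustration, not a replacement for the library routines: the fixed threshold of 128, the assumption that the background is brighter than the product, and the helper names `binarize` and `connected_components` are all made up for this example.

```python
from collections import deque

def binarize(gray, threshold):
    """Return a 0/1 mask: 1 where a pixel is darker than the threshold.

    Assumes a bright background and a darker object (an assumption made
    for this sketch; invert the comparison for the opposite case).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(mask):
    """Label 4-connected foreground regions with a breadth-first flood fill.

    Returns (labels, boxes) where labels is a grid of region ids (0 is
    background) and boxes maps each id to (min_row, min_col, max_row, max_col).
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    boxes = {}
    next_label = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 0 or labels[r][c] != 0:
                continue  # background, or already part of a labelled region
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            boxes[next_label] = (r0, c0, r1, c1)
    return labels, boxes

# Toy 5x6 grayscale image: bright background (250) with two dark objects.
image = [
    [250, 250, 250, 250, 250, 250],
    [250,  40,  40, 250, 250, 250],
    [250,  40,  40, 250,  90, 250],
    [250, 250, 250, 250,  90, 250],
    [250, 250, 250, 250, 250, 250],
]
mask = binarize(image, threshold=128)
labels, boxes = connected_components(mask)
print(boxes)  # one bounding box per detected object
```

On real images you would not hand-roll this: in OpenCV's Python bindings the same two steps roughly correspond to `cv2.threshold` followed by `cv2.connectedComponentsWithStats`, which will be far faster on 80K images. The bounding boxes can then be used to crop each object out of its surroundings.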

AruniRC answered Sep 28 '22 09:09
