What I'm looking for is advice from people with computer vision experience on which approach or algorithm would work best for this particular problem. I'm an experienced programmer (mostly .NET), but I know next to nothing about computer vision and I want to save time.
I would prefer an algorithm that doesn't need a large training set.
What I want to detect:
Distinctive colors, sharp edges, lack of gradients, and very little noise.
I envision the end result to be something like what Picasa or Windows Live Gallery does - I mark a pony in a few images and the program finds other images containing the same pony.
Cartoonists take particularly strong license in their drawings vs. an unretouched photo. Thus trying to identify Pinkie Pie by color doesn't do a lot of good in a frame where she has fallen into a vat of black paint. Or you might think you could identify Rarity by her horn, but consider the episode where she wishes she could be a regular pony...but after losing her horn she learns a lesson about being oneself.
True. So true.
This means that, depending on what you're trying to do here and the scale of it, it may make sense to provide an interface to a crowdsourcing system. If you haven't seen the White Glove Tracking project, you might find some inspiration in that:
http://whiteglovetracking.com/
It doesn't have to be all automatic or manual, though. You could use a combination of techniques, and bring in human editors whenever there's a threshold of uncertainty.
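To make the "threshold of uncertainty" idea concrete, here is a minimal sketch of routing detections to a human review queue. The `Detection` class, the `route` function, and the two cutoff values are all hypothetical names and numbers chosen for illustration; a real detector would need its own calibrated thresholds.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    image_id: str
    label: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def route(detections, accept_at=0.9, reject_below=0.3):
    """Split detections into accepted, needs-human-review, and rejected lists."""
    accepted, review, rejected = [], [], []
    for d in detections:
        if d.confidence >= accept_at:
            accepted.append(d)   # confident enough to tag automatically
        elif d.confidence < reject_below:
            rejected.append(d)   # confident enough to discard automatically
        else:
            review.append(d)     # uncertain: hand off to a human editor
    return accepted, review, rejected
```

Only the middle band goes to people, so the amount of manual work depends on how wide you make the gap between the two cutoffs.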
As for designing a heuristic, it seems the place to start, if you want a sense of where the ponies are, is to look for the eyes. Beginning with a search for "pony shaped things" might be a bit of a lost cause...especially if these are frames from a cartoon, which might have close-ups. In fact, looking at just your example here, the unicorn is just a head!
The next step I'd suggest would be to look in a certain radius around the eyes for color blocks matching hair and body. All of the My Little Ponies in my collection have unique hair and body colors, and...wait...I mean I don't know if My Little Pony characters have unique color combinations or not!! But they probably do.
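A rough sketch of the "color blocks in a radius around the eyes" step might look like the following. It assumes the eye position is already known and that the image is a plain list of rows of `(r, g, b)` tuples; `color_fingerprint`, `radius`, `top_k`, and `step` are hypothetical names invented here. Colors are bucketed coarsely so that small shading differences fall into the same block.

```python
from collections import Counter


def color_fingerprint(image, eye_xy, radius, top_k=2, step=32):
    """Return the top_k most common quantized colors within radius of the eye."""
    ex, ey = eye_xy
    height, width = len(image), len(image[0])
    counts = Counter()
    for y in range(max(0, ey - radius), min(height, ey + radius + 1)):
        for x in range(max(0, ex - radius), min(width, ex + radius + 1)):
            # Keep only pixels inside the circle around the eye.
            if (x - ex) ** 2 + (y - ey) ** 2 <= radius ** 2:
                r, g, b = image[y][x]
                # Coarse buckets: each channel is reduced to 256 // step levels.
                counts[(r // step, g // step, b // step)] += 1
    return [color for color, _ in counts.most_common(top_k)]
```

In practice the eye's own colors and the background would also land in the counts, so you would probably want to ignore known eye and outline colors before trusting the result.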
Once you've intuited the pony's color fingerprint, you can search further and probably get a bounding box by using something like a flood-fill algorithm, assuming ponies are single polygons with no holes. The eyes can also give you a good idea of how large the pony will likely be in the picture, but once again cartoonists can break that expectation at any moment. Plus, ponies close their eyes, blink, and so on, so anything you do here is going to need vetting.
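For the flood-fill step, a minimal sketch could look like this. It grows a region outward from a seed pixel (for example, a body-colored pixel near the eyes) and tracks the bounding box as it goes. The image format (rows of `(r, g, b)` tuples), the function name `flood_fill_bbox`, and the `tolerance` value are assumptions made for illustration. It only follows one color, so a hair-colored and a body-colored fill would each need their own seed.

```python
from collections import deque


def flood_fill_bbox(image, seed_xy, tolerance=40):
    """Return (min_x, min_y, max_x, max_y) of the region connected to the seed."""
    height, width = len(image), len(image[0])
    sx, sy = seed_xy
    seed_color = image[sy][sx]

    def similar(color):
        # A pixel belongs to the region if every channel is close to the seed's.
        return all(abs(a - b) <= tolerance for a, b in zip(color, seed_color))

    seen = {(sx, sy)}
    queue = deque([(sx, sy)])
    min_x = max_x = sx
    min_y = max_y = sy
    while queue:
        x, y = queue.popleft()
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
        # Visit the four direct neighbors (no diagonals).
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and (nx, ny) not in seen and similar(image[ny][nx])):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return min_x, min_y, max_x, max_y
```

The sharp edges and flat colors of cartoon art are what make this plausible at all; on a photo, the fill would leak or stall much sooner.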
(Note: If you've got an entire video stream, you could conceivably use inter-frame analysis to finesse issues of blinking. More generally, it is probably the case that the ponies are the "most animated" things in most otherwise static frames--that may bolster your confidence in a heuristic for finding them.)
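One simple form of inter-frame analysis is frame differencing: compare two consecutive frames and see where the pixels changed. The sketch below assumes grayscale frames stored as lists of rows of brightness values (0 to 255); `changed_region` and its `threshold` are hypothetical names and values, not part of any library.

```python
def changed_region(prev_frame, next_frame, threshold=30):
    """Return (min_x, min_y, max_x, max_y) of changed pixels, or None if static."""
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(prev_frame, next_frame)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            # Treat a pixel as "animated" if its brightness moved noticeably.
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

If the changed region overlaps the box you got from the color heuristic, that is some extra evidence you found a character rather than a similarly colored piece of background. Camera pans and scene cuts change nearly every pixel, so those frames would need to be filtered out first.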
But whatever you do choose to do...remember that Friendship Is Magic--and so is Image Recognition!!
HostileFork has provided a great answer, but as soon as I read your question it reminded me of pyimagesearch.com, as this example shows.
This particular blog is about a novice learning image recognition, and shows their first project.
They manage to extract the black shapes from this image:
Another good example is this post, which shows how to use Haar Cascades to detect cat faces. Here is the OpenCV tutorial on training Haar Cascades.