This was bugging me over the weekend: What is a good way to solve those Where's Waldo? ['Wally' outside of North America] puzzles, using Mathematica (image-processing and other functionality)?
Here is what I have so far, a function which reduces the visual complexity a little bit by dimming some of the non-red colors:
whereIsWaldo[url_] := Module[{waldo, waldo2, waldoMask},
    waldo = Import[url];
    waldo2 = Image[ImageData[
        waldo] /. {{r_, g_, b_} /;
          Not[r > .7 && g < .3 && b < .3] :> {0, 0,
          0}, {r_, g_, b_} /; (r > .7 && g < .3 && b < .3) :> {1, 1,
          1}}];
    waldoMask = Closing[waldo2, 4];
    ImageCompose[waldo, {waldoMask, .5}]
]
And an example of a URL where this 'works':
whereIsWaldo["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"]
(Waldo is by the cash register):


Players take turns flipping over cards and trying to spot Waldo. The first player to spot Waldo scores a point and the first to score five points wins the game. That is basically all there is to the game. The game boils down to a very basic find the item game where players race to be the first to find Waldo.
Waldo seems to prefer to hide on the upper quarter of the right page. Next check the bottom right half of the right page. Waldo also has an aversion to the bottom left half of the right page. Don't bother looking there until you've exhausted the other hot spots.
The gameplay in Waldo and Friends for Android is identical to the books: you have to find Waldo and other things within a static scene filled with colorful items and other characters who look very similar to him. If you can find all the items quickly, you'll get a better score.
To start the search, simply update your app or visit google.com/maps on desktop. Then press play when you see me waving at you from the side of your screen. You can even ask the Google Assistant on your phone, Chromebook or Home device, “Hey Google, Where's Waldo?” to start. The fun doesn't stop there.
I've found Waldo!

How I've done it
First, I'm filtering out all colours that aren't red
waldo = Import["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"];
red = Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[waldo];
Next, I'm calculating the correlation of this image with a simple black and white pattern to find the red and white transitions in the shirt.
corr = ImageCorrelate[red, 
   Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]], 
   NormalizedSquaredEuclideanDistance];
I use Binarize to pick out the pixels in the image with a sufficiently high correlation and draw white circle around them to emphasize them using Dilation
pos = Dilation[ColorNegate[Binarize[corr, .12]], DiskMatrix[30]];
I had to play around a little with the level. If the level is too high, too many false positives are picked out.
Finally I'm combining this result with the original image to get the result above
found = ImageMultiply[waldo, ImageAdd[ColorConvert[pos, "GrayLevel"], .5]]
My guess at a "bulletproof way to do this" (think CIA finding Waldo in any satellite image any time, not just a single image without competing elements, like striped shirts)... I would train a Boltzmann machine on many images of Waldo - all variations of him sitting, standing, occluded, etc.; shirt, hat, camera, and all the works. You don't need a large corpus of Waldos (maybe 3-5 will be enough), but the more the better.
This will assign clouds of probabilities to various elements occurring in whatever the correct arrangement, and then establish (via segmentation) what an average object size is, fragment the source image into cells of objects which most resemble individual people (considering possible occlusions and pose changes), but since Waldo pictures usually include a LOT of people at about the same scale, this should be a very easy task, then feed these segments of the pre-trained Boltzmann machine. It will give you probability of each one being Waldo. Take one with the highest probability.
This is how OCR, ZIP code readers, and strokeless handwriting recognition work today. Basically you know the answer is there, you know more or less what it should look like, and everything else may have common elements, but is definitely "not it", so you don't bother with the "not it"s, you just look of the likelihood of "it" among all possible "it"s you've seen before" (in ZIP codes for example, you'd train BM for just 1s, just 2s, just 3s, etc., then feed each digit to each machine, and pick one that has most confidence). This works a lot better than a single neural network learning features of all numbers.
I agree with @GregoryKlopper that the right way to solve the general problem of finding Waldo (or any object of interest) in an arbitrary image would be to train a supervised machine learning classifier. Using many positive and negative labeled examples, an algorithm such as Support Vector Machine, Boosted Decision Stump or Boltzmann Machine could likely be trained to achieve high accuracy on this problem. Mathematica even includes these algorithms in its Machine Learning Framework.
The two challenges with training a Waldo classifier would be:
A quick Google image search turns up some good data -- I'm going to have a go at collecting some training examples and coding this up right now!
However, even a machine learning approach (or the rule-based approach suggested by @iND) will struggle for an image like the Land of Waldos!
I don't know Mathematica . . . too bad. But I like the answer above, for the most part.
Still there is a major flaw in relying on the stripes alone to glean the answer (I personally don't have a problem with one manual adjustment). There is an example (listed by Brett Champion, here) presented which shows that they, at times, break up the shirt pattern. So then it becomes a more complex pattern.
I would try an approach of shape id and colors, along with spacial relations. Much like face recognition, you could look for geometric patterns at certain ratios from each other. The caveat is that usually one or more of those shapes is occluded.
Get a white balance on the image, and red a red balance from the image. I believe Waldo is always the same value/hue, but the image may be from a scan, or a bad copy. Then always refer to an array of the colors that Waldo actually is: red, white, dark brown, blue, peach, {shoe color}.
There is a shirt pattern, and also the pants, glasses, hair, face, shoes and hat that define Waldo. Also, relative to other people in the image, Waldo is on the skinny side.
So, find random people to obtain an the height of people in this pic. Measure the average height of a bunch of things at random points in the image (a simple outline will produce quite a few individual people). If each thing is not within some standard deviation from each other, they are ignored for now. Compare the average of heights to the image's height. If the ratio is too great (e.g., 1:2, 1:4, or similarly close), then try again. Run it 10(?) of times to make sure that the samples are all pretty close together, excluding any average that is outside some standard deviation. Possible in Mathematica?
This is your Waldo size. Walso is skinny, so you are looking for something 5:1 or 6:1 (or whatever) ht:wd. However, this is not sufficient. If Waldo is partially hidden, the height could change. So, you are looking for a block of red-white that ~2:1. But there has to be more indicators.
Any of those could apply. These are also negative checks against similar people in the pic -- e.g., #2 negates wearing a red-white apron (too close to shoes), #5 eliminates light colored hair. Also, shape is only one indicator for each of these tests . . . color alone within the specified distance can give good results.
This will narrow down the areas to process.
Storing these results will produce a set of areas that should have Waldo in it. Exclude all other areas (e.g., for each area, select a circle twice as big as the average person size), and then run the process that @Heike laid out with removing all but red, and so on.
Any thoughts on how to code this?
Edit:
Thoughts on how to code this . . . exclude all areas but Waldo red, skeletonize the red areas, and prune them down to a single point. Do the same for Waldo hair brown, Waldo pants blue, Waldo shoe color. For Waldo skin color, exclude, then find the outline.
Next, exclude non-red, dilate (a lot) all the red areas, then skeletonize and prune. This part will give a list of possible Waldo center points. This will be the marker to compare all other Waldo color sections to.
From here, using the skeletonized red areas (not the dilated ones), count the lines in each area. If there is the correct number (four, right?), this is certainly a possible area. If not, I guess just exclude it (as being a Waldo center . . . it may still be his hat).
Then check if there is a face shape above, a hair point above, pants point below, shoe points below, and so on.
No code yet -- still reading the docs.
I have a quick solution for finding Waldo using OpenCV.
I used the template matching function available in OpenCV to find Waldo.
To do this a template is needed. So I cropped Waldo from the original image and used it as a template.

Next I called the cv2.matchTemplate() function along with the normalized correlation coefficient as the method used. It returned a high probability at a single region as shown in white below (somewhere in the top left region):

The position of the highest probable region was found using cv2.minMaxLoc() function, which I then used to draw the rectangle to highlight Waldo:

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With