I'm trying to teach myself some machine learning, and have been using the MNIST database (http://yann.lecun.com/exdb/mnist/) do so. The author of that site wrote a paper in '98 on all different kinds of handwriting recognition techniques, available at http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf.
The 10th method mentioned is a "Tangent Distance Classifier". The idea being that if you place each image in a (NxM)-dimensional vector space, you can compute the distance between two images as the distance between the hyperplanes formed by each where the hyperplane is given by taking the point, and rotating the image, rescaling the image, translating the image, etc.
I can't figure out enough to fill in the missing details. I understand that most of these are indeed linear operators, so how does one use that fact to then create the hyperplane? And once we have a hyperplane, how do we take its distance with other hyperplanes?
I will give you some hints. You need some background knowledge in image processing. Please refer to 2,3 for details.
c
implementation of tangent distanceAccording to 3, the first step you need to do is to smooth the picture. Below we show the result of 3 different smooth operations (check section 4 of 3) (The left column shows the result images, the right column shows the original images and the convolution operators). This step is to map the discrete vector to continuous one so that it is differentiable. The author suggests to use a Gaussian function. If you need more background about image convolution, here is an example.
After this step is done, you have calculated the horizontal and vertical shift:
Here I show you one of the tangent calculations implemented in 2 - the scaling tangent. From 3, we know the transformation is as below:
/* scaling */
for(k=0;k<height;k++)
for(j=0;j<width;j++) {
currentTangent[ind] = ((j+offsetW)*x1[ind] + (k+offsetH)*x2[ind])*factor;
ind++;
}
In the beginning of td.c
in 2's implementation, we know the below definition:
factorW=((double)width*0.5);
offsetW=0.5-factorW;
factorW=1.0/factorW;
factorH=((double)height*0.5);
offsetH=0.5-factorH;
factorH=1.0/factorH;
factor=(factorH<factorW)?factorH:factorW; //min
The author is using images with size 16x16. So we know
factor=factorW=factorH=1/8,
and
offsetH=offsetW = 0.5-8 = -7.5
Also note we already computed
x1[ind]
= , x2[ind]
=
So that, we plug in those constants:
currentTangent[ind] = ((j-7.5)*x1[ind] + (k-7.5)*x2[ind])/8
= x1 * (j-7.5)/8 + x2 * (k-7.5)/8.
Since j
(also k
) is an integer between 0 and 15 inclusive (the width and the height of the image are 16 pixels), (j-7.5)/8
is just a fraction number between -0.9375
to 0.9375
.
So I guess (j+offsetW)*factor
is the displacement for each pixel, which is proportional to the horizontal distance from the pixel to the center of the image. Similarly you know the vertical displacement (k+offsetH)*factor
.
Rotation tangent is defined as below in 3:
/* rotation */
for(k=0;k<height;k++)
for(j=0;j<width;j++) {
currentTangent[ind] = ((k+offsetH)*x1[ind] - (j+offsetW)*x2[ind])*factor;
ind++;
}
Using the conclusion from previous, we know (k+offsetH)*factor
corresponds to y
. Similarly - (j+offsetW)*factor
corresponds to -x
. So you know that is exactly the formula used in 3.
You can find all other tangents described in 3 implemented at 2. I like the below image from 3, which clearly shows the displacements effect of different transformation tangents.
Just follow the implementation in tangentDistance
function:
// determine the tangents of the first image
calculateTangents(imageOne, tangents, numTangents, height, width, choice, background);
// find the orthonormal tangent subspace
numTangentsRemaining = normalizeTangents(tangents, numTangents, height, width);
// determine the distance to the closest point in the subspace
dist=calculateDistance(imageOne, imageTwo, (const double **) tangents, numTangentsRemaining, height, width);
I think the above should be enough to get you started and if anything is missing, please read 3 carefully and see corresponding implementations in 2. Good luck!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With