
Distance between hyperplanes

I'm trying to teach myself some machine learning, and have been using the MNIST database (http://yann.lecun.com/exdb/mnist/) to do so. The author of that site wrote a paper in '98 on all different kinds of handwriting recognition techniques, available at http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf.

The 10th method mentioned is a "Tangent Distance Classifier". The idea is that if you place each image in an (NxM)-dimensional vector space, you can compute the distance between two images as the distance between the hyperplanes formed by each, where an image's hyperplane is obtained by taking its point and rotating, rescaling, translating, etc. the image.

I can't figure out enough to fill in the missing details. I understand that most of these are indeed linear operators, so how does one use that fact to then create the hyperplane? And once we have a hyperplane, how do we take its distance with other hyperplanes?

michael dillard asked Nov 12 '12 08:11


1 Answer

I will give you some hints. You need some background knowledge in image processing. Please refer to [2] and [3] for details.

  • [2] is a C implementation of tangent distance
  • [3] is a paper that describes tangent distance in more detail

Image Convolution

According to [3], the first step is to smooth the image. The figure below shows the result of three different smoothing operations (see section 4 of [3]): the left column shows the result images, and the right column shows the original images and the convolution operators. This step maps the discrete image to a continuous function so that it is differentiable. The author suggests using a Gaussian function. If you need more background about image convolution, here is an example.

[Figure: results of three smoothing operations, see section 4 of [3]]
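To make the smoothing step concrete, here is a minimal sketch in C. It is not the code from [2]: it uses the small separable kernel [1 2 1]/4, a common 3x3 stand-in for a Gaussian, and clamps at the image border. The function names smooth_binomial and clamp_index are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Clamp an index into [0, n-1] so border pixels reuse the nearest edge value. */
static int clamp_index(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Smooth a row-major height x width image with the separable kernel
   [1 2 1]/4, applied horizontally and then vertically.
   Assumption: this kernel only approximates the Gaussian suggested in [3]. */
static void smooth_binomial(const double *src, double *dst, int height, int width)
{
    assert(src && dst && src != dst && height > 0 && width > 0);
    double *tmp = malloc((size_t)height * (size_t)width * sizeof *tmp);
    assert(tmp);

    /* horizontal pass: src -> tmp */
    for (int k = 0; k < height; k++) {
        const double *row = src + k * width;
        for (int j = 0; j < width; j++)
            tmp[k * width + j] = 0.25 * row[clamp_index(j - 1, width)]
                               + 0.50 * row[j]
                               + 0.25 * row[clamp_index(j + 1, width)];
    }

    /* vertical pass: tmp -> dst */
    for (int k = 0; k < height; k++)
        for (int j = 0; j < width; j++)
            dst[k * width + j] = 0.25 * tmp[clamp_index(k - 1, height) * width + j]
                               + 0.50 * tmp[k * width + j]
                               + 0.25 * tmp[clamp_index(k + 1, height) * width + j];

    free(tmp);
}
```

A single bright pixel gets spread over its 3x3 neighbourhood, which is what makes the finite differences taken afterwards behave sensibly. A wider kernel smooths more and would be closer to what [3] describes.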

After this step is done, you have calculated the horizontal and vertical shift tangents:

[Images: the horizontal and vertical shift tangents]
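As a rough illustration of what the two shift tangents are, the sketch below estimates them with central differences on an already smoothed image. This is an assumption for illustration only: [2] may use different filters, and the name shift_tangents is hypothetical. The outputs x1 and x2 play the same role as the arrays of the same name used in the loops further down.

```c
#include <assert.h>

/* Horizontal (x1) and vertical (x2) shift tangents of a row-major image,
   estimated with central differences. At the border the missing neighbour
   is replaced by the pixel itself, which halves the one-sided difference. */
static void shift_tangents(const double *img, double *x1, double *x2,
                           int height, int width)
{
    assert(img && x1 && x2 && height > 0 && width > 0);
    int ind = 0;
    for (int k = 0; k < height; k++)
        for (int j = 0; j < width; j++) {
            int left  = (j > 0) ? j - 1 : j;
            int right = (j < width - 1) ? j + 1 : j;
            int up    = (k > 0) ? k - 1 : k;
            int down  = (k < height - 1) ? k + 1 : k;
            x1[ind] = 0.5 * (img[k * width + right] - img[k * width + left]);
            x2[ind] = 0.5 * (img[down * width + j] - img[up * width + j]);
            ind++;
        }
}
```

On an image that is a linear ramp, for example img = 2*j + 3*k, the interior values come out as x1 = 2 and x2 = 3, i.e. the slopes in the horizontal and vertical directions.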

Calculating Scaling Tangent

Here I show one of the tangent calculations implemented in [2]: the scaling tangent. From [3], we know the transformation is as below:

[Image: the scaling transformation from [3]]

/* scaling */
for(k=0;k<height;k++)
  for(j=0;j<width;j++) {
    currentTangent[ind] = ((j+offsetW)*x1[ind] + (k+offsetH)*x2[ind])*factor;
    ind++;
  }

At the beginning of td.c in [2], we find the following definitions:

factorW=((double)width*0.5);
offsetW=0.5-factorW;
factorW=1.0/factorW;

factorH=((double)height*0.5);
offsetH=0.5-factorH;
factorH=1.0/factorH;

factor=(factorH<factorW)?factorH:factorW; //min

The author is using images with size 16x16. So we know

factor=factorW=factorH=1/8, 

and

offsetH=offsetW = 0.5-8 = -7.5

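Also note that we have already computed the two shift tangents:

  • x1[ind] = the horizontal shift tangent at pixel ind (the derivative of the smoothed image in the horizontal direction),
  • x2[ind] = the vertical shift tangent at pixel ind (the derivative of the smoothed image in the vertical direction).

Plugging in those constants: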

currentTangent[ind] = ((j-7.5)*x1[ind] + (k-7.5)*x2[ind])/8
                    = x1 * (j-7.5)/8 + x2 * (k-7.5)/8.

Since j (and likewise k) is an integer between 0 and 15 inclusive (the width and the height of the image are 16 pixels), (j-7.5)/8 is just a fraction between -0.9375 and 0.9375.

So I guess (j+offsetW)*factor is the displacement for each pixel, which is proportional to the horizontal distance from the pixel to the center of the image. Similarly, (k+offsetH)*factor is the vertical displacement.
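To check the arithmetic above, the constants and the scaling loop can be put into one self-contained function. This is a sketch assembled from the two snippets quoted from td.c, not a copy of [2]; the function name scaling_tangent and its signature are made up here.

```c
#include <assert.h>

/* Scaling tangent: each pixel is displaced along (x, y), its normalised
   position relative to the image centre, so the tangent is x*x1 + y*x2.
   x1 and x2 are the horizontal and vertical shift tangents. */
static void scaling_tangent(const double *x1, const double *x2, double *out,
                            int height, int width)
{
    assert(x1 && x2 && out && height > 0 && width > 0);
    double halfW = 0.5 * width, halfH = 0.5 * height;
    double offsetW = 0.5 - halfW;            /* -7.5 for width 16  */
    double offsetH = 0.5 - halfH;            /* -7.5 for height 16 */
    double factorW = 1.0 / halfW, factorH = 1.0 / halfH;
    double factor = (factorH < factorW) ? factorH : factorW;   /* min */

    int ind = 0;
    for (int k = 0; k < height; k++)
        for (int j = 0; j < width; j++) {
            out[ind] = ((j + offsetW) * x1[ind] + (k + offsetH) * x2[ind]) * factor;
            ind++;
        }
}
```

With a 16x16 image where x1 is 1 everywhere and x2 is 0, the first row of out runs from -0.9375 at j=0 to 0.9375 at j=15, which matches the range worked out above.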

Calculating Rotation Tangent

The rotation tangent is defined as below in [3]:

[Image: rotation tangent formula from [3]]

/* rotation */
for(k=0;k<height;k++)
  for(j=0;j<width;j++) {
    currentTangent[ind] = ((k+offsetH)*x1[ind] - (j+offsetW)*x2[ind])*factor;
    ind++;
  }

Using the conclusion from the previous section, we know (k+offsetH)*factor corresponds to y. Similarly, -(j+offsetW)*factor corresponds to -x. So the loop computes y*x1 - x*x2, which is exactly the formula used in [3].
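The same kind of self-contained sketch works for rotation. Again this is assembled from the snippets quoted above rather than copied from [2], and the name rotation_tangent is hypothetical.

```c
#include <assert.h>

/* Rotation tangent: each pixel is displaced along (y, -x), where (x, y) is
   its normalised position relative to the image centre, so the tangent is
   y*x1 - x*x2. x1 and x2 are the horizontal and vertical shift tangents. */
static void rotation_tangent(const double *x1, const double *x2, double *out,
                             int height, int width)
{
    assert(x1 && x2 && out && height > 0 && width > 0);
    double halfW = 0.5 * width, halfH = 0.5 * height;
    double offsetW = 0.5 - halfW, offsetH = 0.5 - halfH;
    double factorW = 1.0 / halfW, factorH = 1.0 / halfH;
    double factor = (factorH < factorW) ? factorH : factorW;   /* min */

    int ind = 0;
    for (int k = 0; k < height; k++)
        for (int j = 0; j < width; j++) {
            out[ind] = ((k + offsetH) * x1[ind] - (j + offsetW) * x2[ind]) * factor;
            ind++;
        }
}
```

Note that the displacement (y, -x) used here is perpendicular to the displacement (x, y) used for scaling at every pixel: scaling pushes pixels away from the centre, rotation pushes them around it.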

You can find all the other tangents described in [3] implemented in [2]. I like the image below from [3], which clearly shows the displacement effect of the different transformation tangents.

[Image: displacement effects of the different transformation tangents, from [3]]

Calculating the tangent distance between images

Just follow the implementation in the tangentDistance function:

// determine the tangents of the first image
calculateTangents(imageOne, tangents, numTangents, height, width, choice, background);

// find the orthonormal tangent subspace 
numTangentsRemaining = normalizeTangents(tangents, numTangents, height, width);

// determine the distance to the closest point in the subspace
dist=calculateDistance(imageOne, imageTwo, (const double **) tangents, numTangentsRemaining, height, width);
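To show what the last two calls amount to, here is a minimal sketch of a one-sided tangent distance: orthogonalise the tangents of the first image, then subtract from (imageTwo - imageOne) everything that lies in their span. It is my own illustration, not the code from [2]: the names dot, orthogonalize and tangent_distance_sq and the eps threshold are assumptions, the tangents are made orthogonal but not unit length, and the result is the squared distance.

```c
#include <assert.h>
#include <stdlib.h>

static double dot(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int q = 0; q < n; q++) s += a[q] * b[q];
    return s;
}

/* Gram-Schmidt without normalisation, in place. Tangents whose squared
   norm falls to eps or below (linearly dependent ones) are dropped.
   Returns the number kept; they end up in t[0..kept-1]. */
static int orthogonalize(double **t, int num, int n, double eps)
{
    int kept = 0;
    for (int i = 0; i < num; i++) {
        for (int p = 0; p < kept; p++) {
            double c = dot(t[i], t[p], n) / dot(t[p], t[p], n);
            for (int q = 0; q < n; q++) t[i][q] -= c * t[p][q];
        }
        if (dot(t[i], t[i], n) > eps) {
            double *swap = t[kept]; t[kept] = t[i]; t[i] = swap;
            kept++;
        }
    }
    return kept;
}

/* Squared distance from image two to the subspace one + span(t),
   where t[0..kept-1] are mutually orthogonal. */
static double tangent_distance_sq(const double *one, const double *two,
                                  double **t, int kept, int n)
{
    double *d = malloc((size_t)n * sizeof *d);
    assert(d);
    for (int q = 0; q < n; q++) d[q] = two[q] - one[q];
    for (int p = 0; p < kept; p++) {
        double c = dot(d, t[p], n) / dot(t[p], t[p], n);
        for (int q = 0; q < n; q++) d[q] -= c * t[p][q];   /* remove component along t[p] */
    }
    double result = dot(d, d, n);
    free(d);
    return result;
}
```

This also answers the "distance between hyperplanes" part of the question in the one-sided case: the hyperplane is the first image plus all linear combinations of its tangents, and the distance is the length of what is left of the difference vector after projecting out those tangents. Here n is height*width, with each image flattened into one vector.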

I think the above should be enough to get you started. If anything is missing, please read [3] carefully and see the corresponding implementations in [2]. Good luck!

greeness answered Nov 11 '22 23:11