Playing perfect Tetris: how to align and scale two curves using scaling and translation?

Given a scaling parameter on the y axis (s) and a translation parameter on the x axis (t), how can one scale and align two curves that do not coincide, when the purpose is to maximize curve superposition (not to minimize distance)?

As indicated by @DWin, this could be rebranded "How to play Tetris perfectly with R", though it does have applications far beyond winning a Tetris game.

A variation of this question could involve any number of similarity transformations (rotation, translation and scaling).

Given curve 1

curve1<-data.frame(x=c(1,1,2,2,3),
                   y=c(9,6,6,3,3))

with(curve1, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

(figure: plot of curve 1)

and curve 2

curve2<-data.frame(x=c(4,5,5,6,6,7),
                   y=c(2,2,1,1,2,3))

with(curve2, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

(figure: plot of curve 2)

I wish to find s and t that maximize superposition between the two curves.

Ideally the method would be in R using optim.

In this made-up example, t=3 and s=1/3, so that

t=3
s=1/3

with(curve2, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

with(curve1, lines(x=x+t, y=y*s, col="red"))

(figure: curve 1 shifted and scaled onto curve 2)

Note that to obtain such a fit, regions where the curves can reach a consensus must carry a higher weight in the parametrization than regions that cannot be superimposed, and the larger the consensus region, the higher its weight.

Trails I have been exploring:

  • minimize area between curves (a rough sketch follows this list)
  • algorithm for superposing two curves
  • shape analysis using the shapes package
  • Iterative closest point (ICP) if someone has implemented it in R or can port one of the C implementations

    Bonus points for a method using maximum likelihood (assuming normal distribution of error).
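To make the first lead concrete, here is a rough, untested sketch of the area-between-curves idea (my assumption, not a vetted solution): resample both curves onto a shared x-grid with approx() and integrate the absolute difference. Note that approx() averages duplicated x values, so the vertical segments of curve 1 get flattened, which is a known limitation of this approach.

# untested sketch: area between the transformed curve 1 and curve 2,
# approximated on a shared x-grid (vertical segments get averaged away)
area_between <- function(par, c1 = curve1, c2 = curve2, n = 200) {
  lo <- max(min(c1$x) + par[1], min(c2$x))   # overlap of the two x-ranges
  hi <- min(max(c1$x) + par[1], max(c2$x))
  if (lo >= hi) return(1e6)                  # no x-overlap: heavy penalty
  xs <- seq(lo, hi, length.out = n)
  y1 <- approx(c1$x + par[1], c1$y * par[2], xout = xs, ties = mean)$y
  y2 <- approx(c2$x, c2$y, xout = xs, ties = mean)$y
  sum(abs(y1 - y2)) * (xs[2] - xs[1])        # Riemann-sum estimate of the area
}
optim(c(1, 1), area_between)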

asked May 31 '12 by Etienne Low-Décarie


    2 Answers

This will return the distances between the points when curve1 is scaled on the y-axis by a factor "tfac" and moved on the x-axis by an amount "s" (better not to use "t" as a variable name in R, since t() is the transpose function):

     as.matrix( dist( rbind( as.matrix(curve2),
                             ( matrix(c(rep(s, 5), rep(1, 5)), ncol = 2) +  # the x-shift matrix (note: it also adds 1 to each y)
                               as.matrix(curve1) ) %*%
                             matrix(c(1, 0, 0, tfac), ncol = 2) ) )         # the y-scaling matrix
              )[-(1:5), -(6:11)]  # extract the relevant block of distances from the full matrix
    

    It should be a simple matter to put that in a function to be minimized:

     dfunc <- function(C1, C2, s, tfac) { sum( .... ) }
    

    I'm not absolutely sure this will return the result you are expecting, since the objective function you are implying may not be the sum of distances. You may need to turn to integer programming methods; the Optimization CRAN Task View would be a good place to look for those methods in R. If this problem arises, an alternative might be to round the "s" value and allow scaling only to the nearest power of 2 (a sketch of this idea appears at the end of this answer).

    dfunc <- function(param, D1=curve1, D2=curve2) { 
               sum( as.matrix(dist( 
                       rbind(as.matrix(D2), 
                             ( matrix(c(rep(param[1], 5), rep(1,5)), ncol=2) + 
                               as.matrix(D1) ) %*% 
                       matrix(c(1,0,0,param[2]),ncol=2) ) ) )[-(1:5), -(6:11)])}
    optim(c(1,1), dfunc)
    #$par
    #[1] 3.3733977 0.2243866
    # trimmed output
    

    Using these values one obtains the following superposition: (figure: superposition using the fitted values)
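    For completeness, the overlay can be redrawn with base graphics. This is a hypothetical sketch (not in the original answer) that applies to curve1 exactly the shift and scale matrices used inside dfunc:

    # replot the fit: transform curve1 with the fitted parameters and
    # overlay it on curve2 (uses the same matrices as dfunc)
    fit <- optim(c(1, 1), dfunc)$par
    with(curve2, plot(x = x, y = y, type = "l", xlim = c(0, 10), ylim = c(0, 10)))
    moved <- ( matrix(c(rep(fit[1], 5), rep(1, 5)), ncol = 2) + as.matrix(curve1) ) %*%
             matrix(c(1, 0, 0, fit[2]), ncol = 2)
    lines(moved[, 1], moved[, 2], col = "red")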

    I might, therefore, try s=3, tfac=0.25. (I see looking back that I switched the roles of t and s from what you asked for. Sorry.)
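    Returning to the rounding suggestion above (and noting that the eyeballed tfac = 0.25 is itself a power of 2): since rounding makes the objective piecewise constant, a small grid search may be more reliable than optim. This is my untested sketch, not part of the original answer:

    # evaluate dfunc over integer shifts and power-of-2 scale factors
    grid <- expand.grid(shift = 0:6, scale = 2^(-4:2))
    grid$val <- apply(grid[, 1:2], 1, dfunc)
    grid[which.min(grid$val), ]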

answered Sep 18 '22 by IRTFM


    Okay, here's an attempt at a solution.

    The basic trick is this: we rasterise the two curves, and then do the comparison on a tile-by-tile basis. This seems like a fairly reasonable way of measuring curve superposition, at least. To encourage the optimiser to approach the target curve, we also introduce a loss that penalises transforms which leave the curve far away from the other one.

    No guarantees this works for more complicated curves and transformations, but it's an idea, at least.

     curve2 <- data.frame(x = c(4, 5, 5, 6, 6, 7),
                          y = c(2, 2, 1, 1, 2, 3))
     # return the raster cells (as a two-column index matrix) visited by the
     # segment from (ax, ay) to (bx, by), on a grid of cell size 1/scaling
     fillin <- function(ax, ay, bx, by, scaling = 10, steps = 100)
       floor(cbind(seq(from = ax, to = bx, len = steps),
                   seq(from = ay, to = by, len = steps)) * scaling)
     # rasterise curve2 into a 100 x 100 binary matrix
     Bmat <- matrix(0, 100, 100)
     for (i in 2:nrow(curve2)){
       Bmat[fillin(curve2[i-1, 1], curve2[i-1, 2], curve2[i, 1], curve2[i, 2])] = 1
     }
     Bmat.orig = Bmat
    
     Bmat = Bmat.orig
     # construct a utility field based on (roughly) Manhattan
     # distances to the closest point on the curve
     # shift a matrix by 'offset' cells, padding the border with zeros
     shift = function(mat, offset){
       mat0 = array(0, dim = dim(mat) + 2)
       mat0[1:nrow(mat) + 1 + offset[1], 1:ncol(mat) + 1 + offset[2]] = mat
       return(mat0[1:nrow(mat) + 1, 1:ncol(mat) + 1])
     }
    
     # each pass dilates the occupied cells by one step in the four cardinal
     # directions and adds them with a decaying weight 1/i, so cells close
     # to the curve accumulate more utility than cells far away
     for (i in 1:100){
       Bm = (Bmat != 0)
       Btmp1 = shift(Bm, c(1, 0))
       Btmp2 = shift(Bm, c(-1, 0))
       Btmp3 = shift(Bm, c(0, 1))
       Btmp4 = shift(Bm, c(0, -1))
       Bmat = Bmat + pmax(Bm, Btmp1, Btmp2, Btmp3, Btmp4)/i
     }
    
     # give the cells on the curve itself a large bonus
     Bmat2 = replace(Bmat, Bmat == max(Bmat), max(Bmat) + 10)
    
     # construct and compare rasterised versions
     getcurve = function(trans = c(0, 1), curve = data.frame(x = c(1, 1, 2, 2, 3),
                         y = c(9, 6, 6, 3, 3)), Bmat = Bmat2){
       Amat = array(0, dim = dim(Bmat))
       curve[, 1] = curve[, 1] + trans[1]   # x-translation
       curve[, 2] = curve[, 2] * trans[2]   # y-scaling
       for (i in 2:nrow(curve)){
         fillin(curve[i-1, 1], curve[i-1, 2], curve[i, 1], curve[i, 2]) -> ind
         # bail out if the transformed curve leaves the raster
         if (min(ind) < 1 || max(ind) > nrow(Bmat)) return(array(-1, dim = dim(Bmat)))
         Amat[ind] = 1
       }
       # standardise, so the score is comparable across transforms
       Amat = (Amat - mean(Amat))/sd(as.vector(Amat))
       Amat
     }
     compcurve = function(trans = c(0, 1), curve = data.frame(x = c(1, 1, 2, 2, 3),
                          y = c(9, 6, 6, 3, 3)), Bmat = Bmat2){
       Amat = getcurve(trans, curve, Bmat)
       -sum(Amat * Bmat)   # negated, since optim minimises
     }
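     # (hypothetical sanity check, not in the original answer: the transform
     # from the question, t = 3 and s = 1/3, should produce a good score)
     compcurve(c(3, 1/3))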
     #SANN seems to work for this, but is slow. Beware of finite differencing
     # - criterion is non-smooth! 
     optim(c(0,1), compcurve, method = "SANN", Bmat = Bmat2) -> output
     image(Bmat)
     contour(getcurve(output$par), add = TRUE)
    

    Result:

    (figure: fit in black versus the target criterion)

    Not too shabby, maybe?

    You will possibly have to fudge the specifics of the rasterisation to make it work for other problems, and you might want to tune how the optim is done.

    A 'smarter' alternative is to note that for an optimal solution, it's probably the case that at least one pair of vertices will coincide. This can give you a better search strategy (a rough sketch of this enumeration follows below). The advantage of the rasterisation scheme compared to area-between-curves is that it's possibly more flexible with respect to different transformations and non-graphs (in particular, the vertical line in your first curve is a problem). You can potentially avoid rasterisation by an appropriate computation, but it gives me a headache just to think about it.
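    A hypothetical sketch of that vertex-pairing idea (my addition, untested): for every pair of vertices, solve for the shift and scale that make them coincide, then score each candidate with the compcurve criterion defined above (this assumes Bmat2 is still in scope and that the y values of curve 1 are nonzero):

     # enumerate the (t, s) candidates that make one vertex of curve1
     # coincide with one vertex of curve2, then keep the best-scoring one
     curve1 <- data.frame(x = c(1, 1, 2, 2, 3), y = c(9, 6, 6, 3, 3))  # from the question
     pairs <- expand.grid(i = seq_len(nrow(curve1)), j = seq_len(nrow(curve2)))
     cand  <- cbind(t = curve2$x[pairs$j] - curve1$x[pairs$i],   # shift aligning the x's
                    s = curve2$y[pairs$j] / curve1$y[pairs$i])   # scale aligning the y's
     scores <- apply(cand, 1, compcurve)   # compcurve is negated, so smaller is better
     cand[which.min(scores), ]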

    Since we are maximising a criterion, this is a maximum likelihood method. Curiously, I think it is actually impossible to phrase this question as a maximum likelihood problem with normal errors, because normal errors imply L2-based criteria, which famously will not give you exact superpositions.

answered Sep 19 '22 by Fhnuzoag