 

Minimization with constraint on all parameters in R

Tags:

r

I want to fit a simple linear model Y = x1 + x2 + x3 + x4 + x5 by ordinary least squares, with the constraint that the sum of all coefficients must equal 5. How can I accomplish this in R? All of the packages I've seen seem to allow constraints on individual coefficients, but I can't figure out how to set a single constraint that ties several coefficients together. I'm not tied to OLS; if this requires an iterative approach, that's fine as well.

asked Feb 21 '23 by eykanal


2 Answers

The basic math is as follows: we start with

mu = a0 + a1*x1 + a2*x2 + a3*x3 + a4*x4

and we want to find a0 through a4 that minimize the sum of squares (SSQ) between mu and our response variable y.

If we replace the last parameter (say a4) with C-a1-a2-a3 to honour the constraint, where C is the required sum of the coefficients, we end up with a new set of linear equations

mu = a0 + a1*x1 + a2*x2 + a3*x3 + (C-a1-a2-a3)*x4
   = a0 + a1*(x1-x4) + a2*(x2-x4) + a3*(x3-x4) + C*x4

(note that a4 has disappeared ...)

Something like this (untested!) implements it in R.

  1. Original data frame:

    d <- data.frame(y=runif(20),
                    x1=runif(20),
                    x2=runif(20),
                    x3=runif(20),
                    x4=runif(20))
    
  2. Create a transformed version where each predictor except the last has the last predictor "swept out", e.g. x1 -> x1-x4; x2 -> x2-x4; ...

    dtrans <- data.frame(y=d$y,
                         sweep(d[,2:4],
                               1,
                               d[,5],
                               "-"),
                         x4=d$x4)
    
  3. Rename to tx1, tx2, ... to minimize confusion:

    names(dtrans)[2:4] <- paste("t",names(dtrans[2:4]),sep="")
    
  4. Sum-of-coefficients constraint:

    constr <- 5  
    
  5. Now fit the model with an offset (a quick check of the constraint is sketched after these steps):

    lm(y~tx1+tx2+tx3,offset=constr*x4,data=dtrans)
    

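The quick check mentioned in step 5 (again untested; the names fit, a123, a4 and mu are only for illustration): recover the eliminated coefficient from the constraint and confirm that the original parameterization reproduces the same residuals.

    ## store the fit from step 5
    fit <- lm(y ~ tx1 + tx2 + tx3, offset = constr*x4, data = dtrans)
    cc <- coef(fit)                      ## (Intercept), tx1, tx2, tx3
    a123 <- cc[c("tx1", "tx2", "tx3")]   ## these are a1, a2, a3 on the original scale
    a4 <- constr - sum(a123)             ## eliminated coefficient, recovered from the constraint
    c(a123, x4 = a4)                     ## the four slopes sum to 5 by construction
    ## residuals from the original parameterization match the fitted model
    mu <- cc["(Intercept)"] + as.matrix(d[, c("x1","x2","x3","x4")]) %*% c(a123, a4)
    all.equal(unname(d$y - drop(mu)), unname(residuals(fit)))
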
It wouldn't be too hard to make this more general.

This requires a little more thought and manipulation than simply specifying a constraint to a canned optimization program. On the other hand, (1) it could easily be wrapped in a convenience function; (2) it's much more efficient than calling a general-purpose optimizer, since the problem is still linear (and in fact one dimension smaller than the one you started with). It could even be done with big data (e.g. biglm). (Actually, it occurs to me that if this is a linear model, you don't even need the offset, although using the offset means you don't have to compute a0=intercept-C*x4 after you finish.)
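For point (1), a rough sketch of such a convenience function (untested; constrained_lm and its arguments are names I'm making up here). It assumes the response is in the first column of the data frame and eliminates the last predictor, exactly as in the steps above:

    constrained_lm <- function(data, total = 5) {
      p <- ncol(data)                ## response in column 1, predictors in 2..p
      last <- data[[p]]              ## the predictor to be eliminated
      dtrans <- data
      dtrans[2:(p - 1)] <- sweep(data[2:(p - 1)], 1, last, "-")
      form <- reformulate(names(data)[2:(p - 1)], response = names(data)[1])
      fit <- lm(form, data = dtrans, offset = total*last)
      ## recover the eliminated coefficient; the non-intercept
      ## coefficients of the result sum to 'total' by construction
      cc <- coef(fit)
      full <- c(cc, setNames(total - sum(cc[-1]), names(data)[p]))
      list(fit = fit, coefficients = full)
    }

    ## e.g., with the simulated d from above:
    ## constrained_lm(d, total = 5)$coefficients
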

answered Feb 23 '23 by Ben Bolker


Since you said you are open to other approaches, this can also be solved as a quadratic programming (QP) problem:

Minimize a quadratic objective: the sum of the squared errors,

subject to a linear constraint: your weights must sum to 5.

Assuming X is your n-by-5 matrix and Y is a vector of length n, this would solve for your optimal weights:

library(limSolve)
lsei(A = X,
     B = Y,
     E = matrix(1, nrow = 1, ncol = 5),
     F = 5)
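
A minimal end-to-end sketch with simulated data (the names X, Y and sol are only illustrative); lsei returns the fitted weights in the X component of its result:

set.seed(1)
X <- matrix(runif(100), nrow = 20, ncol = 5)    ## 20 observations, 5 predictors
Y <- runif(20)

sol <- lsei(A = X,
            B = Y,
            E = matrix(1, nrow = 1, ncol = 5),  ## 1*w1 + ... + 1*w5 ...
            F = 5)                              ## ... must equal 5
sol$X        ## constrained weights
sum(sol$X)   ## 5, up to numerical tolerance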
answered Feb 23 '23 by flodel