Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast nonnegative quantile and Huber regression in R

I am looking for a fast way to do nonnegative quantile and Huber regression in R (i.e. with the constraint that all coefficients are >0). I tried using the CVXR package for quantile & Huber regression and the quantreg package for quantile regression, but CVXR is very slow and quantreg seems buggy when I use nonnegativity constraints. Does anybody know of a good and fast solution in R, e.g. using the Rcplex package or R gurobi API, thereby using the faster CPLEX or gurobi optimizers?

Note that I need to run a problem size like below 80 000 times, whereby I only need to update the y vector in each iteration, but still use the same predictor matrix X. In that sense, I feel it's inefficient that in CVXR I now have to do obj <- sum(quant_loss(y - X %*% beta, tau=0.01)); prob <- Problem(Minimize(obj), constraints = list(beta >= 0)) within each iteration, when the problem is in fact staying the same and all I want to update is y. Any thoughts to do all this better/faster?

Minimal example:

## Generate problem data
n <- 7 # n predictor vars
m <- 518 # n cases 
set.seed(1289)
beta_true <- 5 * matrix(stats::rnorm(n), nrow = n)+20
X <- matrix(stats::rnorm(m * n), nrow = m, ncol = n)
y_true <- X %*% beta_true
eps <- matrix(stats::rnorm(m), nrow = m)
y <- y_true + eps

Nonnegative quantile regression using CVXR :

## Solve nonnegative quantile regression problem using CVX
require(CVXR)
beta <- Variable(n)
quant_loss <- function(u, tau) { 0.5*abs(u) + (tau - 0.5)*u }
obj <- sum(quant_loss(y - X %*% beta, tau=0.01))
prob <- Problem(Minimize(obj), constraints = list(beta >= 0))
system.time(beta_cvx <- pmax(solve(prob, solver="SCS")$getValue(beta), 0)) # estimated coefficients, note that they ocasionally can go - though and I had to clip at 0
# 0.47s
cor(beta_true,beta_cvx) # correlation=0.99985, OK but very slow

Syntax for nonnegative Huber regression is the same but would use

M <- 1      ## Huber threshold
obj <- sum(CVXR::huber(y - X %*% beta, M))

Nonnegative quantile regression using quantreg package :

### Solve nonnegative quantile regression problem using quantreg package with method="fnc"
require(quantreg)
R <- rbind(diag(n),-diag(n))
r <- c(rep(0,n),-rep(1E10,n)) # specify bounds of coefficients, I want them to be nonnegative, and 1E10 should ideally be Inf
system.time(beta_rq <- coef(rq(y~0+X, R=R, r=r, tau=0.5, method="fnc"))) # estimated coefficients
# 0.12s
cor(beta_true,beta_rq) # correlation=-0.477, no good, and even worse with tau=0.01...
like image 630
Tom Wenseleers Avatar asked Nov 07 '22 12:11

Tom Wenseleers


1 Answers

To speed up CVXR, you can get the problem data once in the beginning, then modify it within a loop and pass it directly to the solver's R interface. The code for this is

prob_data <- get_problem_data(prob, solver = "SCS")

Then, parse out the arguments and pass them to scs from the scs library. (See Solver.solve in solver.R). You'll have to dig into the details of the canonicalization, but I expect if you're just changing y at each iteration, it should be a straightforward modification.

like image 165
anqif Avatar answered Nov 15 '22 06:11

anqif