Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to save the coefficients for each optim iteration?

Tags:

r

This question is related to Applying Cost Functions in R

I would like to know how to save the coefficients generated for each iteration of optim. trace=TRUE enables me to get the coefficients for each iteration printed, but how can I save them?

Example code:

set.seed(1)
X <- matrix(rnorm(1000), ncol=10) # some random data
Y <- sample(0:1, 100, replace=TRUE)

# Implement Sigmoid function
sigmoid <- function(z) {
  g <- 1/(1+exp(-z))
  return(g)
}

cost.glm <- function(theta,X) {
  m <- nrow(X)
  g <- sigmoid(X%*%theta)
  (1/m)*sum((-Y*log(g)) - ((1-Y)*log(1-g)))
}

X1 <- cbind(1, X)

df <- optim(par=rep(0,ncol(X1)), fn = cost.glm, method='CG',
            X=X1, control=list(trace=TRUE))

Which outputs:

 Conjugate gradients function minimizer
Method: Fletcher Reeves
tolerance used in gradient test=2.00089e-11
0 1 0.693147
parameters    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000 
   0.00000    0.00000    0.00000    0.00000 
 i> 1 3 0.662066
parameters   -0.01000   -0.01601   -0.06087    0.14891    0.04123    0.03835   -0.01898 
   0.00637    0.02954   -0.01423   -0.07544 
 i> 2 5 0.638548
parameters   -0.02366   -0.03733   -0.13803    0.32782    0.09034    0.08082   -0.03978 
   0.01226    0.07120   -0.02925   -0.16042 
 i> 3 7 0.630501
parameters   -0.03478   -0.05371   -0.19149    0.43890    0.11960    0.10236   -0.04935 
   0.01319    0.10648   -0.03565   -0.20408 
 i> 4 9 0.627570.......

And dfdoes not contain any information on the coefficients, but only displays the final coefficients and the final cost:

 str(df)
List of 5
 $ par        : num [1:11] -0.0679 -0.1024 -0.2951 0.6162 0.124 ...
 $ value      : num 0.626
 $ counts     : Named int [1:2] 53 28
  ..- attr(*, "names")= chr [1:2] "function" "gradient"
 $ convergence: int 0
 $ message    : NULL
like image 444
nadizan Avatar asked Mar 06 '23 08:03

nadizan


2 Answers

## use `capture.output` to get raw output
out <- capture.output(df <- optim(par=rep(0,ncol(X1)), fn = cost.glm, method='CG',
                                  X=X1, control=list(trace=TRUE)))
## lines that contain parameters
start <- grep("parameters", out)
param_line <- outer(seq_len(start[2] - start[1] - 1) - 1, start, "+")
## parameter message
param_msg <- gsub("parameters", "", out[param_line])
## parameter matrix (a row per iteration)
param <- matrix(scan(text = param_msg), ncol = length(df$par), byrow = TRUE)

## inspect output (rounded to 2-digits for compact display)

head(round(param, 2))
#       [,1]  [,2]  [,3] [,4] [,5] [,6]  [,7]  [,8] [,9] [,10] [,11]
# [1,]  0.00  0.00  0.00 0.00 0.00 0.00  0.00  0.00 0.00  0.00  0.00
# [2,] -0.01 -0.02 -0.06 0.15 0.04 0.04 -0.02  0.01 0.03 -0.01 -0.08
# [3,] -0.02 -0.04 -0.14 0.33 0.09 0.08 -0.04  0.01 0.07 -0.03 -0.16
# [4,] -0.03 -0.05 -0.19 0.44 0.12 0.10 -0.05  0.01 0.11 -0.04 -0.20
# [5,] -0.04 -0.07 -0.23 0.51 0.14 0.11 -0.05  0.01 0.14 -0.04 -0.22
# [6,] -0.05 -0.08 -0.25 0.55 0.14 0.12 -0.05  0.01 0.16 -0.04 -0.23

tail(round(param, 2))
#[23,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[24,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[25,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[26,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[27,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[28,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21

## one way to visualize the search steps
matplot(param, type = "l", lty = 1, xlab = "iterations")

like image 127
Zheyuan Li Avatar answered Mar 14 '23 20:03

Zheyuan Li


So, the other solution works... but involves parsing the response inside the trace. Here's an approach that gives you access to the objects directly. (And would be general in any other optimization function that doesn't allow you to easily show a text trace):

(This works because you can assign inside of an environment from within a function)

EDIT: This adds another row every time cost.glm is run, not just every time the trace is evaluated.

Also added the conversion to the matrix format used by the other solution.

set.seed(1)
X <- matrix(rnorm(1000), ncol=10) # some random data
Y <- sample(0:1, 100, replace=TRUE)


# Implement Sigmoid function
sigmoid <- function(z) {
  g <- 1/(1+exp(-z))
  return(g)
}

# Create environment to store output
# We could also use .GlobalEnv
params_env <- new.env()

# Initialize parameters object
params_env$optim_run <- list()

cost.glm <- function(theta,X) {
  # Extend the list by 1 and insert theta inside the given environment
  # This can be done more efficiently by
  # extending several at a time, but that's easy to add.
  n <- length(params_env[['optim_run']])
  params_env[['optim_run']][[n + 1]] <- theta

  m <- nrow(X)
  g <- sigmoid(X%*%theta)
  (1/m)*sum((-Y*log(g)) - ((1-Y)*log(1-g)))
}

X1 <- cbind(1, X)

df <- optim(par=rep(0,ncol(X1)), fn = cost.glm, method='CG',
            X=X1, control=list(trace=TRUE))

# View list of all param values
print(params_env$optim_run)

# Return as same format as other solution
param <- do.call(rbind, params_env[['optim_run']])

matplot(param, type = "l", lty = 1, xlab = "iterations")

As we can see, there are more jumps in this set as cost.glm explores the space

like image 22
Shape Avatar answered Mar 14 '23 18:03

Shape