How can I make R use more CPU and memory? [duplicate]

Tags: r, cpu-usage

No matter how intensive the R computation is, it doesn't use more than 25% of the CPU. I have tried setting the priority of rsession.exe to High and even Realtime, but the usage remains the same. Is there any way to increase R's CPU usage so that it uses the full potential of my system, or am I misunderstanding the problem? Thanks in advance for the help.

P.S.: Below is a screenshot of the CPU usage.

[Image: screenshot of the CPU usage]

Suraj asked May 02 '15 05:05


1 Answer

Base R is single-threaded, so about 25% CPU usage is expected on a 4-core CPU: one fully busy core out of four. On a single Windows machine, it is possible to spread processing across the workers of a cluster (one worker per core, if you like) using either the parallel package or the foreach package.
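As a rough sanity check of that 25% figure, here is a minimal sketch that assumes only the parallel package shipped with R. The name single_thread_share is hypothetical and used purely for illustration. Note that detectCores() counts logical cores by default, so the result may differ from the physical core count.

```r
library(parallel)

# Number of logical cores visible to R (can be NA on some platforms)
n_cores <- detectCores()

# Rough upper bound, in percent of total CPU, for one single-threaded R process
single_thread_share <- 100 / n_cores
single_thread_share
```

On a machine where detectCores() returns 4, this evaluates to 25, which matches the ceiling described in the question.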

First of all, the parallel package (included in R since 2.14.0, no need to install) provides functions based on the snow package; these functions are extensions of lapply(). The foreach package provides an extension of the for-loop construct. Note that, to run in parallel, it has to be used with a registered backend, here the doParallel package.
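To make the lapply() connection concrete, here is a minimal sketch that uses only the parallel package. The function slow_square and the choice of two workers are assumptions made for illustration, not part of the original answer.

```r
library(parallel)

# Hypothetical toy task, used only for illustration
slow_square <- function(x) {
  Sys.sleep(0.1)  # stand-in for real work
  x^2
}

inputs <- 1:8

# Sequential version: a single R process does all the work
seq_res <- lapply(inputs, slow_square)

# Parallel version: same call shape, with the cluster as the first argument
cl <- makeCluster(2)  # two workers; an arbitrary choice for this sketch
par_res <- parLapply(cl, inputs, slow_square)
stopCluster(cl)

# Both versions should return the same list of results
identical(seq_res, par_res)
```

The only structural change from the sequential call is the cluster object, which is why existing lapply() code is usually straightforward to port.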

Below is a quick example of k-means clustering using both packages. The idea is simple: (1) fit kmeans() on each worker, (2) combine the outcomes and (3) select the fit with the minimum tot.withinss.

library(parallel)
library(iterators)
library(foreach)
library(doParallel)

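# parallel
split = detectCores()   # one worker per core
eachStart = 25          # random starts per worker

cl = makeCluster(split)
# load MASS (for the Boston data) on every worker
init = clusterEvalQ(cl, { library(MASS); NULL })
# each worker runs kmeans() with eachStart random starts
results = parLapplyLB(cl
                      ,rep(eachStart, split)
                      ,function(nstart) kmeans(Boston, 4, nstart=nstart))
# keep the fit with the smallest total within-cluster sum of squares
withinss = sapply(results, function(result) result$tot.withinss)
result = results[[which.min(withinss)]]
stopCluster(cl)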

result$tot.withinss
#[1] 1814438

# foreach
split = detectCores()
eachStart = 25
# set up iterators
iters = iter(rep(eachStart, split))
# set up combine function
comb = function(res1, res2) {
  if(res1$tot.withinss < res2$tot.withinss) res1 else res2
}

cl = makeCluster(split)
registerDoParallel(cl)
result = foreach(nstart=iters, .combine="comb", .packages="MASS") %dopar%
  kmeans(Boston, 4, nstart=nstart)
stopCluster(cl)

result$tot.withinss
#[1] 1814438

Further details of those packages and more examples can be found in the following posts.

  • Parallel Processing on Single Machine I
  • Parallel Processing on Single Machine II
  • Parallel Processing on Single Machine III
2 revs answered Sep 19 '22 23:09