
parLapply multiple arguments R

I am trying to compute evaporation with the Hargreaves method from the SPEI package. This requires minimum temperature (TMIN) and maximum temperature (TMAX). Parallel computing is my best bet, given that the Tmin and Tmax raster stacks have 500,000 cells and 100 layers each. The hargreaves function takes Tmin, Tmax, and the latitude at each grid cell as input. The following is my first guess at how to go about this:

library(SPEI)
# go parallel 
library(parallel)
clust <- makeCluster(detectCores())

#har <- hargreaves(TMIN,TMAX,lat=37.6475) # get evaporation for a station. 

However, my data is gridded.

Tmin and Tmax are lists; each data frame in Tmin and Tmax has a $latitude attached to it. In pet, k$d is Tmin and k$d is Tmax (maybe I should give pet two arguments, e.g. function(k, y), instead of just k?)

pet <- function(k) {
  hargreaves(k$d,k$d, k$latitude, na.rm=TRUE)}

# Make library and function available to clust
clusterEvalQ(clust, library(SPEI))
clusterExport(clust, pet)

pet_list <- parLapply(clust, TMIN,TMAX, pet)

parLapply accepts just one argument. How can I pass Tmin and Tmax to parLapply? Is it that my pet function is not correct?

Thanks.

code123 asked Oct 30 '22 08:10



1 Answer

An index can serve as the single argument: the function uses it to look up the matching elements of globally defined lists. I give an example below.

library(SPEI)
library(parallel)

Define the test list.

Tmin <- list(aa = data.frame(a=1:30, b1=runif(30), b2=runif(30), latitude=runif(30)),
  bb = data.frame(a=1:30, b1=runif(30), b2=runif(30), latitude=runif(30)))

Tmax <- list(aa = data.frame(a=1:30, b1=runif(30), b2=runif(30), latitude=runif(30)),
  bb = data.frame(a=1:30, b1=runif(30), b2=runif(30), latitude=runif(30)))

Make the cluster

clust <- makeCluster(2)

This is the rewritten function. To keep the test simple, it adds two columns instead of calling hargreaves.

pet1 <- function(ind){
  Tmin[[ind]]$a + Tmax[[ind]]$a
}

Load the SPEI library on each worker and export everything in the workspace to each one. Exporting the whole workspace is normally not great form, so forgive me.

clusterEvalQ(clust, library(SPEI))
clusterExport(clust, ls())

Run the parLapply function

pet_test <- parLapply(clust, seq_along(Tmin), pet1)
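Exporting the whole workspace works, but a narrower variant is possible. The sketch below assumes the workers only need `Tmin` and `Tmax`, and passes their names to `clusterExport` as a character vector, which is the form that function expects. The data here is made-up stand-in data, the cluster is named `cl` so that it does not clobber `clust`, and `envir = environment()` is only there so the snippet also works when it is not run at the top level.

```r
library(parallel)

# Hypothetical stand-in data: two named lists of data frames
Tmin <- list(aa = data.frame(a = 1:30, latitude = 37.6),
             bb = data.frame(a = 1:30, latitude = 40.2))
Tmax <- list(aa = data.frame(a = 31:60, latitude = 37.6),
             bb = data.frame(a = 31:60, latitude = 40.2))

# Same index-based function as pet1: looks up both lists by position
pet1 <- function(ind) {
  Tmin[[ind]]$a + Tmax[[ind]]$a
}

cl <- makeCluster(2)

# Export only the objects the workers need, by name.
# pet1 itself is shipped to the workers by parLapply.
clusterExport(cl, c("Tmin", "Tmax"), envir = environment())

# seq_along() also behaves correctly for an empty list
pet_test <- parLapply(cl, seq_along(Tmin), pet1)

# Shut the workers down when finished
stopCluster(cl)
```

With large rasters this matters more than in the toy case, since every exported object is copied to every worker.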

Edit: updated to account for Tmin and Tmax being lists. The core idea is the same: use an index as the single argument to the pet function, and reference global variables from within pet.
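If referencing globals feels awkward, the `parallel` package also provides `clusterMap`, a parallel analogue of `mapply` that walks several lists in step and hands one element of each to the function. The sketch below is only an illustration under assumptions: the data is made up, and `pet2` is a hypothetical stand-in that averages the two `d` columns. In the real case its body would be the `hargreaves` call from the question, with SPEI loaded on the workers via `clusterEvalQ`, and probably with the latitude passed by name as `lat =`, as in the station example in the question.

```r
library(parallel)

set.seed(1)
# Hypothetical stand-in data with the d and latitude columns from the question
Tmin <- list(aa = data.frame(d = runif(30, 0, 10), latitude = 37.6),
             bb = data.frame(d = runif(30, 0, 10), latitude = 40.2))
Tmax <- list(aa = data.frame(d = runif(30, 15, 25), latitude = 37.6),
             bb = data.frame(d = runif(30, 15, 25), latitude = 40.2))

# Two-argument function: receives one element of Tmin and one of Tmax.
# Placeholder computation (mean temperature), not the Hargreaves formula.
pet2 <- function(tmin, tmax) {
  (tmin$d + tmax$d) / 2
}

cl <- makeCluster(2)

# Pairs Tmin[[1]] with Tmax[[1]], Tmin[[2]] with Tmax[[2]], and so on
pet_list <- clusterMap(cl, pet2, Tmin, Tmax)

stopCluster(cl)
```

Because the data travels as function arguments, nothing has to be exported to the workers beforehand.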

Tad Dallas answered Nov 15 '22 06:11
