Simplest way to do parallel replicate

I am fond of the parallel package in R and how easy and intuitive it is to do parallel versions of apply, sapply, etc.

Is there a similar parallel function for replicate?

bdeonovic asked Oct 09 '13 19:10

2 Answers

You can use the parallel versions of lapply or sapply. Instead of telling replicate to evaluate an expression n times, you apply over 1:n, and instead of passing a bare expression, you wrap that expression in a function that ignores the argument sent to it.

Possibly something like:

    # create cluster
    library(parallel)
    cl <- makeCluster(detectCores() - 1)

    # get library support needed to run the code
    clusterEvalQ(cl, library(MASS))

    # put objects in place that might be needed for the code
    myData <- data.frame(x = 1:10, y = rnorm(10))
    clusterExport(cl, c("myData"))

    # set a different seed on each member of the cluster (just in case)
    clusterSetRNGStream(cl)

    # ... then parallel replicate ...
    parSapply(cl, 1:10000, function(i, ...) { x <- rnorm(10); mean(x) / sd(x) })

    # stop the cluster
    stopCluster(cl)

as the parallel equivalent of:

    replicate(10000, { x <- rnorm(10); mean(x) / sd(x) })
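If you do this often, the pattern can be wrapped in a small helper. The following is only a sketch: `par_rep` and `stat` are hypothetical names, and the two-worker cluster and the seed value are arbitrary choices. It also re-seeds the workers with `clusterSetRNGStream` before each run, which should make two runs on the same cluster return the same draws.

```r
library(parallel)

# Hypothetical helper: call the zero-argument function `f` n times on the
# cluster. The index `i` is ignored, as described above.
par_rep <- function(cl, n, f) {
  parSapply(cl, seq_len(n), function(i) f())
}

# The statistic to replicate, written as a function instead of an expression
stat <- function() {
  x <- rnorm(10)
  mean(x) / sd(x)
}

cl <- makeCluster(2)  # two workers, an arbitrary choice for this sketch

# Seed the workers' random number streams, run, then re-seed and run again
clusterSetRNGStream(cl, iseed = 42)
r1 <- par_rep(cl, 1000, stat)
clusterSetRNGStream(cl, iseed = 42)
r2 <- par_rep(cl, 1000, stat)

stopCluster(cl)

length(r1)         # number of replications returned
identical(r1, r2)  # compares the two seeded runs
```

The reproducibility here depends on using the same number of workers each time, because parSapply splits the indices across workers in fixed chunks.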
Greg Snow answered Sep 21 '22 18:09

Using clusterEvalQ as a model, I think I would implement a parallel replicate as:

    parReplicate <- function(cl, n, expr, simplify=TRUE, USE.NAMES=TRUE)
      parSapply(cl, integer(n), function(i, ex) eval(ex, envir=.GlobalEnv),
                substitute(expr), simplify=simplify, USE.NAMES=USE.NAMES)

The arguments simplify and USE.NAMES are compatible with sapply rather than replicate, but they make it a better wrapper around parSapply in my opinion.

Here's an example derived from the replicate man page:

    library(parallel)
    cl <- makePSOCKcluster(3)
    hist(parReplicate(cl, 100, mean(rexp(10))))
    stopCluster(cl)  # shut the workers down when finished
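One thing to keep in mind: since the expression is evaluated in each worker's global environment, any variables it refers to have to exist on the workers, not just in your own session. A rough sketch of that, reusing the parReplicate definition from above (the variable `rate` and the two-worker cluster are made up for illustration), together with simplify=FALSE to get a list back:

```r
library(parallel)

# Same definition as in the answer above
parReplicate <- function(cl, n, expr, simplify=TRUE, USE.NAMES=TRUE)
  parSapply(cl, integer(n), function(i, ex) eval(ex, envir=.GlobalEnv),
            substitute(expr), simplify=simplify, USE.NAMES=USE.NAMES)

cl <- makePSOCKcluster(2)  # two workers, an arbitrary choice

# `rate` is a hypothetical variable that at first exists only in this session
rate <- 2

# Copy it into each worker's global environment so the expression can see it
clusterExport(cl, "rate")

# Each replication computes the mean of 10 exponential draws on a worker
res <- parReplicate(cl, 100, mean(rexp(10, rate = rate)))

# With simplify=FALSE the results come back as a list, one element per replication
res_list <- parReplicate(cl, 3, rexp(2), simplify = FALSE)

stopCluster(cl)
```

Without the clusterExport call, I would expect the workers to fail with an "object 'rate' not found" error.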
Steve Weston answered Sep 17 '22 18:09
