I know this code can be made much shorter and more efficient using R's powerful vector handling capabilities. I just cannot figure out how at the moment...
The basic task is to adjust cells within each row so that the row total is forced to match a predefined number, determined by another data frame. That way the total population of each area is forced to a certain value (each row represents an area), while the ratios between the cells moving from one column to the next remains the same.
Ugly way of doing it (first loop is just to create an example data frame; sure that could be done better and all; I just can't stop using loops!):
con1 <- array(dim=c(5,3))
set.seed(1066)
for(i in 1:ncol(con1)){
con1[,i] <- round(rnorm(n=5,mean=10,sd=3))}
con1 <- data.frame(con1)
con2 <- data.frame(array(c(8:13, 9:14, 10:15), dim=c(5,3)))
apply(con1,1, sum)
apply(con2,1, sum) # different row totals
con1.adj <- con1
for ( i in 1:nrow(con1)){
con1.adj[i,1] <- con1[i,1] * ( sum(con2[i,]) / sum(con1[i,]) )
con1.adj[i,2] <- con1[i,2] * ( sum(con2[i,]) / sum(con1[i,]) )
con1.adj[i,3] <- con1[i,3] * ( sum(con2[i,]) / sum(con1[i,]) )
}
con1.adj <- data.frame(con1.adj)
apply(con1.adj,1, sum) # same row totals
(context: Dug up this code up from someone else's work and used merrily for a while. It looks awful to me now that I've inched a little way up the steep R learning curve. Also want the code to be re-used by others. Really enjoying the language, and will enjoy it even more if I can find a more beautiful way of doing this)
I think this one-liner should do the job:
con1.adj <- con1 * rowSums(con2) / rowSums(con1)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With