I have the following example set of data:
Example<-data.frame(A=10*1:9,B=10*10:18)
rownames(Example)<-paste("Sample",1:9)
> Example
A B
Sample 1 10 100
Sample 2 20 110
Sample 3 30 120
Sample 4 40 130
Sample 5 50 140
Sample 6 60 150
Sample 7 70 160
Sample 8 80 170
Sample 9 90 180
I am trying to divide each element in both columns by its column's total. I have tried a variety of methods, but I feel like I am missing a fundamental piece of code that would make this easier. I have gotten this far:
ExampleSum1 <- sum(Example[,1])
ExampleSum2 <- sum(Example[,2])
But I don't know how to divide 10, 20, 30, etc. by ExampleSum1, and likewise the second column by ExampleSum2.
data.table solution:
sum.cols = c("A", "B")
library(data.table)
setDT(Example, keep.rownames = TRUE)
Example[ , (sum.cols) := lapply(.SD, function(x) x/sum(x)), .SDcols = sum.cols]
Or perhaps more direct in your case:
Example[ , c("A", "B") := .(A/sum(A), B/sum(B))]
Which gives:
Example
# rn A B
# 1: Sample 1 0.02222222 0.07936508
# 2: Sample 2 0.04444444 0.08730159
# 3: Sample 3 0.06666667 0.09523810
# 4: Sample 4 0.08888889 0.10317460
# 5: Sample 5 0.11111111 0.11111111
# 6: Sample 6 0.13333333 0.11904762
# 7: Sample 7 0.15555556 0.12698413
# 8: Sample 8 0.17777778 0.13492063
# 9: Sample 9 0.20000000 0.14285714
The main appeal of this approach, as opposed to one using colSums or sweep, is that both of those require converting your data to a matrix and then back, which may be costly. It depends on your use case: if your table is small, the other approaches are fine, and the choice comes down to what you find most readable.
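For comparison, here is a minimal sketch of the sweep alternative on a plain data.frame, with the matrix round trip written out explicitly. The names df and props are placeholders of mine, not from the question, and this is only one way to write it:

```r
# Hypothetical plain data.frame copy of the question's data
df <- data.frame(A = 10 * 1:9, B = 10 * 10:18)

# Convert to a matrix, then divide each column (MARGIN = 2) by its column sum
props <- sweep(as.matrix(df), 2, colSums(df), "/")

# Convert back to a data.frame if that is what the rest of the code expects
props <- as.data.frame(props)

colSums(props)  # each column should sum to 1
```

On a nine-row table the conversion cost is negligible; it only starts to matter for large tables.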
I also notice that no other answer mentions the mapply approach, which would work in almost any paradigm; here's the data.table version:
Example[ , (sum.cols) := mapply(`/`, .SD, lapply(.SD, sum), SIMPLIFY = FALSE),
.SDcols = sum.cols]
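As a rough illustration of that portability, the same mapply idea can be sketched on a plain data.frame. This assumes a fresh, unmodified copy of the question's data; the name df is mine and not from the question:

```r
# Hypothetical plain data.frame copy of the question's data
df <- data.frame(A = 10 * 1:9, B = 10 * 10:18)

# Pair each column with its own sum and divide; SIMPLIFY = FALSE keeps the
# result as a list of columns, and df[] <- ... writes them back in place
# so df stays a data.frame
df[] <- mapply(`/`, df, lapply(df, sum), SIMPLIFY = FALSE)

colSums(df)  # each column should now sum to 1
```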
You can get the column sums with colSums, and use paste to build new column names derived from the existing ones. colSums returns a vector of the column sums, but to do column-wise division you need a little trickery. The best way looks to be the one mentioned by @user20650.
## Make new columns: proportions of column sums
dat[,paste(names(dat),"prop", sep="_")] <- t( t(dat) / colSums(dat) )
dat
# A B A_prop B_prop
# Sample1 10 100 0.02222222 0.07936508
# Sample2 20 110 0.04444444 0.08730159
# Sample3 30 120 0.06666667 0.09523810
# Sample4 40 130 0.08888889 0.10317460
# Sample5 50 140 0.11111111 0.11111111
# Sample6 60 150 0.13333333 0.11904762
# Sample7 70 160 0.15555556 0.12698413
# Sample8 80 170 0.17777778 0.13492063
# Sample9 90 180 0.20000000 0.14285714
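The reason for the double transpose is that R recycles a shorter vector down the columns of a matrix (column-major order), so dividing by colSums directly lines the sums up with rows rather than columns. A small sketch on a made-up 3 x 2 matrix (m, cs, wrong and right are illustrative names, not part of the data above) shows the difference:

```r
# Small illustrative matrix, not the data from the question
m <- matrix(c(10, 20, 30, 100, 110, 120), ncol = 2,
            dimnames = list(NULL, c("A", "B")))
cs <- colSums(m)  # A = 60, B = 330

# Naive division: cs is recycled element by element down each column,
# so m[2, "A"] ends up divided by the B total instead of the A total
wrong <- m / cs

# Transposing first turns columns into rows, so the recycled vector lines up
# with the right totals; the outer t() restores the original shape
right <- t(t(m) / cs)

colSums(right)  # each column sums to 1
```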
Data
dat <- read.table(text="A B
Sample1 10 100
Sample2 20 110
Sample3 30 120
Sample4 40 130
Sample5 50 140
Sample6 60 150
Sample7 70 160
Sample8 80 170
Sample9 90 180", header=TRUE)