I have a data frame that I got from
ID <- c("A","A","A","A","B","B","B","B")
Type <- c(45,45,46,46,45,45,46,46)
Point_A <- c(10,NA,30,40,NA,80,NA,100)
Point_B <- c(NA,32,43,NA,65,11,NA,53)
df <- data.frame(ID,Type,Point_A,Point_B)
ID Type Point_A Point_B
1 A 45 10 NA
2 A 45 NA 32
3 A 46 30 43
4 A 46 40 NA
5 B 45 NA 65
6 B 45 80 11
7 B 46 NA NA
8 B 46 100 53
While I learnt from this post, I could roll up the data with ID and one column.
I am currently using sqldf to sum the rows and group by ID and Type. While this does the job for me, its very slow on a bigger dataset.
df1 <- sqldf("SELECT ID, Type, Sum(Point_A) as Point_A, Sum(Point_A) as Point_A
FROM df
GROUP BY ID, Type")
Please suggest the usage of any other techniques that would solve this problem. I have started learning dplyr & plyr packages and I find it very interesting but not knowing how to apply it here.
Desired Output
ID Type Point_A Point_B
1 A 45 10 32
2 A 46 70 43
3 B 45 80 76
4 B 46 100 53
library(data.table)
DT <- as.data.table(df)
DT[, lapply(.SD, sum, na.rm=TRUE), by=list(ID, Type)]
ID Type Point_A Point_B
1: A 45 10 32
2: A 46 70 43
3: B 45 80 76
4: B 46 100 53
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With