I have a large data frame.
As you can see, the code below follows a pattern:
data_1<-data_1
data_2<-data_2 %>% filter(rowSums(data_2[,1:1])==0)
data_3<-data_3 %>% filter(rowSums(data_3[,1:2])==0)
data_4<-data_4 %>% filter(rowSums(data_4[,1:3])==0)
data_5<-data_5 %>% filter(rowSums(data_5[,1:4])==0)
data_6<-data_6 %>% filter(rowSums(data_6[,1:5])==0)
data_7<-data_7 %>% filter(rowSums(data_7[,1:6])==0)
data_8<-data_8 %>% filter(rowSums(data_8[,1:7])==0)
data_9<-data_9 %>% filter(rowSums(data_9[,1:8])==0)
data_10<-data_10 %>% filter(rowSums(data_10[,1:9])==0)
data_11<-data_11 %>% filter(rowSums(data_11[,1:10])==0)
data_12<-data_12 %>% filter(rowSums(data_12[,1:11])==0)
data_13<-data_13 %>% filter(rowSums(data_13[,1:12])==0)
data_14<-data_14 %>% filter(rowSums(data_14[,1:13])==0)
data_15<-data_15 %>% filter(rowSums(data_15[,1:14])==0)
data_16<-data_16 %>% filter(rowSums(data_16[,1:15])==0)
data_17<-data_17 %>% filter(rowSums(data_17[,1:16])==0)
data_18<-data_18 %>% filter(rowSums(data_18[,1:17])==0)
data_19<-data_19 %>% filter(rowSums(data_19[,1:18])==0)
data_20<-data_20 %>% filter(rowSums(data_20[,1:19])==0)
data_21<-data_21 %>% filter(rowSums(data_21[,1:20])==0)
I tried to write a loop like this:
for(i in 1:21){
  data_i <- data_i %>% filter(rowSums(data_i[,1:i-1])==0)
}
but data_i does not do what I intended.
How do I solve this problem?
1) for
We use the test data in the Note at the end, which is based on the built-in anscombe data frame that comes with R. It is best to keep related data frames in a list, so we first create such a list L and then iterate over it, producing a new list L2 so that we don't overwrite the original list. Keeping the input and output separate makes debugging easier.
We could alternatively write seq_along(L)[-1] as seq(2, length(L)) and seq_len(i-1) as seq(1, i-1). Note that if DF is a data frame then DF[, 1] is the first column as a plain vector, whereas DF[, 1, drop = FALSE] is a one-column data frame, which is what rowSums needs.
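To illustrate the two points above, here is a minimal sketch on a toy data frame. DF and its columns are made up for illustration and are not part of the original data. It also shows why 1:i-1 in the question's loop does not mean "1 to i-1".

```r
# Toy data frame, for illustration only
DF <- data.frame(a = c(0, 1), b = c(0, 2))

# Selecting one column drops to a plain vector by default ...
is.data.frame(DF[, 1])                 # FALSE
# ... while drop = FALSE keeps a one-column data frame
is.data.frame(DF[, 1, drop = FALSE])   # TRUE
rowSums(DF[, 1, drop = FALSE])         # works; rowSums(DF[, 1]) would error

# Operator precedence: `:` binds tighter than `-`
i <- 3
1:i-1          # 0 1 2, i.e. (1:i) - 1
seq_len(i-1)   # 1 2
```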
No packages are used.
L <- mget(ls(pattern = "^data_\\d+$"))
L2 <- L
for(i in seq_along(L)[-1]) {
  Li <- L[[i]]
  Sum <- rowSums(Li[, seq_len(i-1), drop = FALSE])
  L2[[i]] <- Li[Sum == 0, ]
}
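One caveat when there are ten or more data frames, as in the question: ls() returns names sorted as strings, so data_10 comes before data_2 and position i in L no longer corresponds to data_i. A possible workaround, sketched here with toy data frames in a scratch environment e (both are assumptions for illustration), is to build the names in numeric order and pass them to mget directly.

```r
# Toy stand-ins for data_1 ... data_12, kept in a scratch environment
e <- new.env()
for (k in 1:12) assign(paste0("data_", k), data.frame(x = k), envir = e)

# ls() sorts names as strings, so data_10 precedes data_2
ls(e, pattern = "^data_\\d+$")

# Building the names in numeric order keeps position i aligned with data_i
nms <- paste0("data_", 1:12)
L <- mget(nms, envir = e)
names(L)
```

With the real data in the global environment, the envir argument could simply be omitted.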
2) lapply
Alternatively, we could use lapply:
L <- mget(ls(pattern = "^data_\\d+$"))
L2 <- L
L2[-1] <- lapply(seq_along(L)[-1], function(i) {
  Li <- L[[i]]
  Sum <- rowSums(Li[, seq_len(i-1), drop = FALSE])
  Li[Sum == 0, ]
})
3) Map
Or use Map:
L3 <- L
f3 <- function(d, i) {
  Sum <- rowSums(d[, seq_len(i-1), drop = FALSE])
  d[Sum == 0, ]
}
L3[-1] <- Map(f3, L[-1], seq_along(L)[-1])
Or special-case the first element like this. Note that Map takes the component names from its first argument after the function, so it is important that f4 be defined so that this argument is L.
f4 <- function(d, i) {
  if (i == 1) d
  else {
    Sum <- rowSums(d[, seq_len(i-1), drop = FALSE])
    d[Sum == 0, ]
  }
}
L4 <- Map(f4, L, seq_along(L))
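If the filtered data frames are needed as separate variables again, which seems to be what the question was after, one option is list2env. The sketch below uses a toy result list res and a scratch environment out (both hypothetical) so that nothing in the workspace is overwritten.

```r
# Toy named list standing in for a result such as L2 or L4
res <- list(data_1 = data.frame(x = 1), data_2 = data.frame(x = 2))

# Copy each list component into an environment under its own name
out <- new.env()
list2env(res, envir = out)
ls(out)
```

Passing envir = .GlobalEnv instead would overwrite the existing data_i objects, so keeping the list is usually the safer choice.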
Note
# create test data
data_1 <- anscombe
data_1[1, 1] <- 0
data_2 <- 10 * anscombe
data_2[2, 1:2] <- 0
data_3 <- 100 * anscombe
data_3[3, 1:3] <- 0
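As a quick sanity check, the sketch below rebuilds the test data and applies the same filter through Map. The helper name keep_zero_prefix is made up for illustration. With this test data only the zeroed row should survive in data_2 and data_3, while data_1 is left untouched.

```r
# Rebuild the test data so the block is self-contained
data_1 <- anscombe
data_1[1, 1] <- 0
data_2 <- 10 * anscombe
data_2[2, 1:2] <- 0
data_3 <- 100 * anscombe
data_3[3, 1:3] <- 0
L <- list(data_1 = data_1, data_2 = data_2, data_3 = data_3)

# Keep rows whose first i-1 columns sum to zero; leave the first element as is
keep_zero_prefix <- function(d, i) {
  if (i == 1) return(d)
  d[rowSums(d[, seq_len(i-1), drop = FALSE]) == 0, ]
}
res <- Map(keep_zero_prefix, L, seq_along(L))

# Row counts per data frame: data_1 has 11 rows, data_2 and data_3 have 1 each
sapply(res, nrow)
```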