
How can I express this code with a for loop or a function?

Tags:

r

I have a large data frame.

As you can see, the code below follows a pattern:

data_1<-data_1
data_2<-data_2 %>% filter(rowSums(data_2[,1:1])==0)
data_3<-data_3 %>% filter(rowSums(data_3[,1:2])==0)
data_4<-data_4 %>% filter(rowSums(data_4[,1:3])==0)
data_5<-data_5 %>% filter(rowSums(data_5[,1:4])==0)
data_6<-data_6 %>% filter(rowSums(data_6[,1:5])==0)
data_7<-data_7 %>% filter(rowSums(data_7[,1:6])==0)
data_8<-data_8 %>% filter(rowSums(data_8[,1:7])==0)
data_9<-data_9 %>% filter(rowSums(data_9[,1:8])==0)
data_10<-data_10 %>% filter(rowSums(data_10[,1:9])==0)
data_11<-data_11 %>% filter(rowSums(data_11[,1:10])==0)
data_12<-data_12 %>% filter(rowSums(data_12[,1:11])==0)
data_13<-data_13 %>% filter(rowSums(data_13[,1:12])==0)
data_14<-data_14 %>% filter(rowSums(data_14[,1:13])==0)
data_15<-data_15 %>% filter(rowSums(data_15[,1:14])==0)
data_16<-data_16 %>% filter(rowSums(data_16[,1:15])==0)
data_17<-data_17 %>% filter(rowSums(data_17[,1:16])==0)
data_18<-data_18 %>% filter(rowSums(data_18[,1:17])==0)
data_19<-data_19 %>% filter(rowSums(data_19[,1:18])==0)
data_20<-data_20 %>% filter(rowSums(data_20[,1:19])==0)
data_21<-data_21 %>% filter(rowSums(data_21[,1:20])==0)

I tried to write a loop like this:

for(i in 1:21){
  data_i <- data_i %>% filter(rowSums(data_i[,1:i-1])==0)
}

but the result for data_i is far from what I intended.

How do I solve this problem?

Asked by 신유철, Jun 25 '26 18:06


1 Answer

1) for

We use the test data in the Note at the end, which is based on the built-in anscombe data frame that comes with R. It is best to keep related data frames in a list, so we first create such a list L and then iterate over it, producing a new list L2 so that we don't overwrite the original list. Keeping the input and output separate makes debugging easier.

Alternatively, seq_along(L)[-1] could be written as seq(2, length(L)) and seq_len(i-1) as seq(1, i-1). Note that if DF is a data frame then DF[, 1] is the first column as a plain vector, whereas DF[, 1, drop = FALSE] is a one-column data frame.
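
The effect of drop = FALSE can be checked directly. The following is a small sketch on the built-in anscombe data frame (DF, v, d and res are illustrative names only); it also shows that seq_len() and seq() differ when the upper bound is 0, which matters only if the loop were to include i = 1:

```r
DF <- anscombe  # built-in data frame, used here only for illustration

# Selecting one column without drop = FALSE returns a plain vector
v <- DF[, 1]
is.data.frame(v)                      # FALSE

# With drop = FALSE the result is still a one-column data frame
d <- DF[, 1, drop = FALSE]
is.data.frame(d)                      # TRUE

# rowSums() needs at least two dimensions, so it fails on the vector
res <- tryCatch(rowSums(v), error = function(e) "error")
res                                   # "error"
length(rowSums(d))                    # 11, one sum per row

# seq_len(0) is empty, whereas seq(1, 0) counts down and has two elements
length(seq_len(0))                    # 0
length(seq(1, 0))                     # 2
```

This is why the single-column case data_2[, 1:1] needs drop = FALSE before it can be passed to rowSums().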

No packages are used.

L <- mget(ls(pattern = "^data_\\d+$"))
L2 <- L
for(i in seq_along(L)[-1]) {
  Li <- L[[i]]
  Sum <- rowSums(Li[, seq_len(i-1), drop = FALSE])
  L2[[i]] <- Li[Sum == 0, ]
} 
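
One caveat when applying this to all 21 data frames: ls() returns names sorted as text, so data_10 comes before data_2 and list position i no longer matches the number in the name. Below is a minimal sketch of one way to restore numeric order, assuming every name has the form data_<number>. The environment e and its toy contents are hypothetical stand-ins for the real workspace:

```r
# Hypothetical stand-in for the workspace: 12 tiny data frames in environment e
e <- new.env()
for (k in 1:12) assign(paste0("data_", k), data.frame(id = k), envir = e)

nms <- ls(e, pattern = "^data_\\d+$")
head(nms, 3)   # text order: "data_1" "data_10" "data_11"

# Reorder by the numeric suffix so that L[[i]] really is data_i
nms <- nms[order(as.integer(sub("^data_", "", nms)))]
L <- mget(nms, envir = e)
names(L)       # "data_1" "data_2" ... "data_12"
```

With only the three test data frames from the Note the text order and the numeric order coincide, so the issue shows up only from data_10 onward.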

2) lapply

Alternatively we could use lapply:

L <- mget(ls(pattern = "^data_\\d+$"))
L2 <- L
L2[-1] <- lapply(seq_along(L)[-1], function(i) {
  Li <- L[[i]]
  Sum <- rowSums(Li[, seq_len(i-1), drop = FALSE])
  Li[Sum == 0, ]
})

3) Map

Or use Map:

L3 <- L
f3 <- function(d, i) {
  Sum <- rowSums(d[, seq_len(i-1), drop = FALSE])
  d[Sum == 0, ]
}
L3[-1] <- Map(f3, L[-1], seq_along(L)[-1])

Or special-case the first element like this. Note that Map takes the component names of its result from the first argument after the function, so it is important to define f4 so that this argument is L.

f4 <- function(d, i) {
  if (i == 1) d 
  else {
    Sum <- rowSums(d[, seq_len(i-1), drop = FALSE])
    d[Sum == 0, ]
  }
}
L4 <- Map(f4, L, seq_along(L))
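
The naming behaviour of Map can be seen in a small sketch. The toy list L below is a hypothetical stand-in, and r1 and r2 are illustrative names only:

```r
# Hypothetical toy list standing in for the list of data frames
L <- list(data_1 = data.frame(a = 1), data_2 = data.frame(a = 2))

# List first: the result inherits the list's names
r1 <- Map(function(d, i) nrow(d), L, seq_along(L))
names(r1)   # "data_1" "data_2"

# Index first: the unnamed integer vector supplies no names
r2 <- Map(function(i, d) nrow(d), seq_along(L), L)
names(r2)   # NULL
```

Putting the list first therefore keeps the data_1, data_2, ... names on the result without any extra step.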

Note

# create test data
data_1 <- anscombe
data_1[1, 1] <- 0
data_2 <- 10 * anscombe
data_2[2, 1:2] <- 0
data_3 <- 100 * anscombe
data_3[3, 1:3] <- 0
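
As a quick check, approach (1) can be run end to end on this test data. The sketch below builds the list with mget(paste0(...)) rather than ls(), an assumption made so that the example picks up only the three test data frames, and then counts the rows that survive:

```r
# Test data from the Note
data_1 <- anscombe
data_1[1, 1] <- 0
data_2 <- 10 * anscombe
data_2[2, 1:2] <- 0
data_3 <- 100 * anscombe
data_3[3, 1:3] <- 0

# Collect the data frames by explicit name, in numeric order
L <- mget(paste0("data_", 1:3))
L2 <- L
for (i in seq_along(L)[-1]) {
  Li <- L[[i]]
  # Sum the first i-1 columns of each row, keeping a data frame for rowSums()
  Sum <- rowSums(Li[, seq_len(i - 1), drop = FALSE])
  L2[[i]] <- Li[Sum == 0, ]
}

sapply(L2, nrow)
# data_1 data_2 data_3
#     11      1      1
```

Only row 2 of data_2 and row 3 of data_3 remain, since those are the rows whose leading columns were set to zero.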
Answered by G. Grothendieck, Jun 27 '26 09:06