
Loop through each column and row, do stuff

Tags: r, dplyr, tidyr

I think this is the best way to describe what I want to do:

df$column <- ifelse(is.na(df$column) == TRUE, 0, 1)

But where column is dynamic. This is because I have about 45 columns all with the same kind of content, and all I want to do is check each cell, replace it with a 1 if there's something in it, a 0 if not. I have of course tried many different things, but since there seems to be no df[index][column] in R, I'm lost. I'd have expected something like this to work, but nope:

for (index in df) {
  for (column in names(df)) {
    df[[index]][[column]] <- ifelse(is.na(df[[index]][[column]]) == TRUE, 0, 1)
  }
}

I could do this quickly in other languages (or even Excel), but I'm just learning R and want to understand why something so simple seems to be so complicated in a language that's meant to work with data. Thanks!

Keith Collins asked Mar 16 '23 03:03



1 Answer

How about this:

df.new = as.data.frame(lapply(df, function(x) ifelse(is.na(x), 0, 1)))

lapply applies a function to each column of the data frame df. In this case, the function does the 0/1 replacement. lapply returns a list. Wrapping it in as.data.frame converts the list to a data frame (which is a special type of list).
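To make those steps concrete, here is a minimal sketch on a small made-up data frame. The column names `a` and `b` and their values are assumptions for illustration only; they are not from the question.

```r
# Hypothetical example data; the real df in the question has about 45 columns.
df <- data.frame(a = c(1, NA, 3), b = c(NA, "x", NA))

# lapply calls the function once per column and returns a plain list.
flags <- lapply(df, function(x) ifelse(is.na(x), 0, 1))
is.list(flags)        # TRUE
is.data.frame(flags)  # FALSE

# as.data.frame turns that list back into a data frame.
df.new <- as.data.frame(flags)
df.new
```

If you would rather overwrite the original than create a copy, `df[] <- lapply(df, function(x) ifelse(is.na(x), 0, 1))` should do the same replacement in place. Assigning into `df[]` keeps the data frame structure and the original column names, whereas `as.data.frame` may alter column names that are not syntactically valid R names.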

In R you can often replace a loop with one of the *apply family of functions. In this case, lapply "loops" over the columns of the data frame. Also, many R functions are "vectorized", meaning the function operates on every value in a vector at once. In this case, ifelse does the replacement on an entire column of the data frame in a single call, so no inner loop over rows is needed.
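As a rough comparison, the same replacement can be written as an explicit loop over column names, as an lapply call, or as one vectorized call on the whole data frame. This is only a sketch; the example data and the names `looped`, `applied`, and `vectorized` are hypothetical.

```r
# Hypothetical example data, for illustration only.
df <- data.frame(a = c(1, NA, 3), b = c(NA, "x", NA))

# 1. Explicit loop: iterate over column names and let ifelse handle
#    the whole column at once, so no loop over rows is needed.
looped <- df
for (column in names(looped)) {
  looped[[column]] <- ifelse(is.na(looped[[column]]), 0, 1)
}

# 2. lapply: the same per-column work without writing the loop.
#    Assigning into applied[] keeps the data frame shape and column names.
applied <- df
applied[] <- lapply(applied, function(x) ifelse(is.na(x), 0, 1))

# 3. Fully vectorized: is.na() on a data frame returns a logical matrix,
#    which ifelse converts cell by cell in a single call.
vectorized <- as.data.frame(ifelse(is.na(df), 0, 1))

looped
applied
vectorized
```

All three should produce the same 0/1 values. The loop in the question fails because `for (index in df)` iterates over the columns themselves, not over row numbers, so `df[[index]]` is not a row lookup.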

eipi10 answered Mar 31 '23 19:03
