
R: How to replace elements of a data.frame?

I'm trying to replace elements of a data.frame containing "#N/A" with "NULL", and I'm running into problems:

foo <- data.frame("day"= c(1, 3, 5, 7), "od" = c(0.1, "#N/A", 0.4, 0.8))

indices_of_NAs <- which(foo == "#N/A") 

replace(foo, indices_of_NAs, "NULL")

Error in [<-.data.frame(*tmp*, list, value = "NULL") : new columns would leave holes after existing columns

I think the problem is that my index treats the data.frame as a vector while the replace function treats it differently somehow, but I'm not sure what the issue is.

John asked May 04 '10 16:05



2 Answers

NULL really means "nothing", not "missing", so it cannot take the place of an actual value. For missing values, R uses NA.
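A small sketch of that difference, using throwaway vectors that are not part of the question: NULL vanishes when it is combined into a vector, while NA keeps its slot.

```r
# NULL is dropped when building a vector; NA holds a position.
with_null <- c(1, NULL, 3)
with_na   <- c(1, NA, 3)

length(with_null)  # 2
length(with_na)    # 3
is.na(with_na)     # FALSE TRUE FALSE
```

This is why a cell in a data frame can be NA but cannot be NULL.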

You can use the replacement form of is.na to update the selected elements directly; it works with a logical result such as foo == "#N/A". (Indices from which() only work in the is.na form, because is.na(foo) is a matrix. Using [ directly on the data frame with a numeric vector invokes list-style access, which selects columns, and that is the cause of your error.)

foo <- data.frame("day" = c(1, 3, 5, 7), "od" = c(0.1, "#N/A", 0.4, 0.8))
NAs <- foo == "#N/A"

## by replace method
is.na(foo)[NAs] <- TRUE

## or directly
foo[NAs] <- NA
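
To see why the which() index fails with replace() but works in the is.na form, here is a minimal sketch on the question's data. The names idx and failed are illustrative, and stringsAsFactors = FALSE is set explicitly (an assumption on my part) so the result does not depend on the R version.

```r
foo <- data.frame(day = c(1, 3, 5, 7),
                  od  = c(0.1, "#N/A", 0.4, 0.8),
                  stringsAsFactors = FALSE)

# foo == "#N/A" is a 4 x 2 logical matrix, so which() returns a
# position counted down the columns of that matrix.
idx <- which(foo == "#N/A")

# On the data frame itself a single numeric index selects columns,
# so this asks for a column far beyond the two that exist and fails.
failed <- tryCatch({
  replace(foo, idx, "NULL")
  FALSE
}, error = function(e) TRUE)

# is.na(foo) is again a logical matrix, so the same index is valid there.
is.na(foo)[idx] <- TRUE
```

After the last line, the second element of foo$od is NA and the rest of the data frame is unchanged.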

However, the od column already holds strings (a factor by default before R 4.0.0), because c() coerced the mixed values when the column was created, so you might need to treat columns individually. A numeric column, for example, will never match the string "#N/A".
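
Treating the od column on its own might look like the sketch below. It assumes the column is character (stringsAsFactors = FALSE is set explicitly for that reason) and that numeric values are what you ultimately want.

```r
foo <- data.frame(day = c(1, 3, 5, 7),
                  od  = c(0.1, "#N/A", 0.4, 0.8),
                  stringsAsFactors = FALSE)

# Mark the placeholder as missing in the one column that can contain it.
foo$od[foo$od == "#N/A"] <- NA

# Convert the remaining strings back to numbers; NA stays NA.
foo$od <- as.numeric(foo$od)
```

If od were a factor instead, as.numeric(as.character(foo$od)) would be needed, since as.numeric() on a factor returns the internal level codes.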

mdsumner answered Oct 11 '22 11:10


Why not

x$col[is.na(x$col)]<-value

?
You won't have to change your data frame.
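
A minimal sketch of this pattern, with a hypothetical data frame x, column col, and replacement value. Note that it fills in entries that are already NA, so for the question's data the "#N/A" strings would have to be turned into NA first.

```r
# Hypothetical data frame and column name, for illustration only.
x <- data.frame(col = c(0.1, NA, 0.4, 0.8))
value <- 0  # placeholder replacement, chosen arbitrarily

# Overwrite only the missing entries of that one column.
x$col[is.na(x$col)] <- value
```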

Aashu answered Oct 11 '22 10:10