
Replace all occurrences of a string in a data frame

Tags: dataframe, r

I'm working on a data frame that has non-detects, which are coded with '<'. Sometimes there is a space after the '<' and sometimes not, e.g. '<2' or '< 2'. I'd like to remove every occurrence of the space.

Example:

    data <- data.frame(name = rep(letters[1:3], each = 3), var1 = rep('< 2', 9), var2 = rep('<3', 9))

      name var1 var2
    1    a  < 2   <3
    2    b  < 2   <3
    3    c  < 2   <3

This is where I've got to:

I can extract all the values and make the new strings, but I can't put them back in the data frame.

    library(stringr)  # provides str_detect() and str_replace_all()

    index <- str_detect(unlist(data), '<')
    index <- matrix(index, nrow = 3)

    data[index]
    #[1] "< 2" "< 2" "< 2" "<3"  "<3"  "<3"

    replacements <- str_replace_all(data[index], "<[ ]+", "<")

    replacements
    #[1] "<2" "<2" "<2" "<3" "<3" "<3"

    data[index] <- replacements
    #Error in `[<-.data.frame`(`*tmp*`, index, value = c("<2", "<2", "<2",  :
    #  unsupported matrix index in replacement
Tony Ladson asked Mar 26 '15 05:03





1 Answer

If you only want to replace every occurrence of "< " (with a space) with "<" (no space), you can lapply over the columns of the data frame and use gsub for the replacement:

    > data <- data.frame(lapply(data, function(x) {
    +                  gsub("< ", "<", x)
    +              }))
    > data
      name var1 var2
    1    a   <2   <3
    2    a   <2   <3
    3    a   <2   <3
    4    b   <2   <3
    5    b   <2   <3
    6    b   <2   <3
    7    c   <2   <3
    8    c   <2   <3
    9    c   <2   <3
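One caveat with the approach above: gsub returns character vectors, so every column passes through a character conversion, and in R versions before 4.0 data.frame() would then turn those strings into factors by default. As a minimal sketch of a variation, the function below touches only character and factor columns and removes any run of whitespace after '<', not just a single space. The helper name `strip_space_after_lt` is hypothetical, and converting factor columns to character is an assumption made for simplicity.

```r
# Hypothetical helper (the name is illustrative, not from the answer above).
# Removes whitespace that directly follows '<' in text columns only.
strip_space_after_lt <- function(df) {
  # Flag the columns that hold text; numeric columns are left untouched.
  is_text <- vapply(
    df,
    function(col) is.character(col) || is.factor(col),
    logical(1)
  )
  if (any(is_text)) {
    df[is_text] <- lapply(df[is_text], function(col) {
      # "<[[:space:]]+" matches '<' followed by one or more whitespace chars.
      # Assumption: factor columns are converted to character here.
      gsub("<[[:space:]]+", "<", as.character(col))
    })
  }
  df
}

# Same example data as in the question, kept as plain character columns.
data <- data.frame(
  name = rep(letters[1:3], each = 3),
  var1 = rep("< 2", 9),
  var2 = rep("<3", 9),
  stringsAsFactors = FALSE
)

cleaned <- strip_space_after_lt(data)
```

Assigning into `df[is_text]` keeps the original object a data frame, so row names and any non-text columns are carried over unchanged.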
Tim Biegeleisen answered Sep 23 '22 04:09
