
How to replace specific characters in a data frame with the value in a variable in R

Tags:

r

I have a data frame that looks like this:

df <- read.table(text="chr     pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191   T   C     1/1     0/1     1/1     0/0     1/1     1/1
Chr1 1463534   G   C     0/0     1/1     0/0     0/1     0/0     0/0
Chr1 1463881   T   A     0/1     0/0     1/1     0/0     1/1     1/1
Chr1 1464091   G   A     0/0     0/0     1/1     0/0     1/1     1/1
Chr1 1464651   T   C     1/1     0/0     1/1     0/1    1/1     1/1", header = TRUE, stringsAsFactors = FALSE)

The expected result:

  chr     pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191   T   C     C/C     T/C     C/C     T/T     C/C     C/C
Chr1 1463534   G   C     G/G     C/C     G/G     G/C     G/G     G/G
Chr1 1463881   T   A     T/A     T/T     A/A     T/T     A/A     A/A
Chr1 1464091   G   A     G/G     G/G     A/A     G/G     A/A     A/A
Chr1 1464651   T   C     C/C     T/T     C/C     T/C    C/C     C/C

The replacement rule is: in df[5:10], every "0" should be replaced by the character in df$Ref for that row, and every "1" by the character in df$Alt for that row. I looked at the question "Replace specific characters in a variable in data frame in R", but it didn't work for my situation. I'd appreciate any help.

user3354212 asked Aug 04 '15 14:08



1 Answer

Creating data:

df <- read.table(text="chr     pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
                 Chr1 1462191   T   C     1/1     0/1     1/1     0/0     1/1     1/1
                 Chr1 1463534   G   C     0/0     1/1     0/0     0/1     0/0     0/0
                 Chr1 1463881   T   A     0/1     0/0     1/1     0/0     1/1     1/1
                 Chr1 1464091   G   A     0/0     0/0     1/1     0/0     1/1     1/1
                 Chr1 1464651   T   C     1/1     0/0     1/1     0/1    1/1     1/1", header = TRUE, stringsAsFactors = FALSE)

Using gsub:

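# Vectorize gsub() over its arguments so each row gets its own replacement
vgsub <- Vectorize(gsub, SIMPLIFY = FALSE)
# Transpose so each original row becomes one column, i.e. one list element
new <- vgsub("0", df$Ref, as.data.frame(t(df[5:10])))  # "0" -> Ref allele
new <- vgsub("1", df$Alt, new)                         # "1" -> Alt allele
# Bind the per-row results back into rows and overwrite the genotype columns
df[5:10] <- do.call("rbind", new)
df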
  chr     pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
1 Chr1 1462191   T   C     C/C     T/C     C/C     T/T     C/C     C/C
2 Chr1 1463534   G   C     G/G     C/C     G/G     G/C     G/G     G/G
3 Chr1 1463881   T   A     T/A     T/T     A/A     T/T     A/A     A/A
4 Chr1 1464091   G   A     G/G     G/G     A/A     G/G     A/A     A/A
5 Chr1 1464651   T   C     C/C     T/T     C/C     T/C     C/C     C/C
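
As a possible alternative, here is a minimal sketch that does both substitutions in a single pass per row with base R's chartr(). It assumes every Ref and Alt allele is exactly one character, as in the example data; it would not be suitable for multi-character alleles. The name geno_cols is only an illustrative helper and is not part of the original code.

```r
# Same example data as in the question
df <- read.table(text = "chr pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191 T C 1/1 0/1 1/1 0/0 1/1 1/1
Chr1 1463534 G C 0/0 1/1 0/0 0/1 0/0 0/0
Chr1 1463881 T A 0/1 0/0 1/1 0/0 1/1 1/1
Chr1 1464091 G A 0/0 0/0 1/1 0/0 1/1 1/1
Chr1 1464651 T C 1/1 0/0 1/1 0/1 1/1 1/1",
  header = TRUE, stringsAsFactors = FALSE)

geno_cols <- 5:10  # hypothetical name for the genotype columns

# Assumption: single-character alleles, since chartr() maps character by character
stopifnot(all(nchar(df$Ref) == 1), all(nchar(df$Alt) == 1))

# For each row, translate "0" -> Ref and "1" -> Alt simultaneously
decoded <- t(sapply(seq_len(nrow(df)), function(i) {
  chartr("01", paste0(df$Ref[i], df$Alt[i]), unlist(df[i, geno_cols]))
}))

# decoded is a matrix with one row per original row; write it back
df[geno_cols] <- decoded
df
```

Because chartr() translates both characters at once, the result of the first substitution is never re-scanned by the second, which is one reason to consider it over two sequential gsub() calls.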
Carlos Cinelli answered Nov 14 '22 13:11