I have a data frame that looks like this:
df <- read.table(text="chr pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191 T C 1/1 0/1 1/1 0/0 1/1 1/1
Chr1 1463534 G C 0/0 1/1 0/0 0/1 0/0 0/0
Chr1 1463881 T A 0/1 0/0 1/1 0/0 1/1 1/1
Chr1 1464091 G A 0/0 0/0 1/1 0/0 1/1 1/1
Chr1 1464651 T C 1/1 0/0 1/1 0/1 1/1 1/1", header=TRUE, stringsAsFactors=FALSE)
The expected result:
chr pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191 T C C/C T/C C/C T/T C/C C/C
Chr1 1463534 G C G/G C/C G/G G/C G/G G/G
Chr1 1463881 T A T/A T/T A/A T/T A/A A/A
Chr1 1464091 G A G/G G/G A/A G/G A/A A/A
Chr1 1464651 T C C/C T/T C/C T/C C/C C/C
The replacements should follow this rule: in df[5:10], "0" should be replaced by the character in df$Ref for that row, and "1" by the character in df$Alt. I checked the question "Replace specific characters in a variable in data frame in R", but it didn't work for my situation. I'd appreciate any help.
To replace the first or all occurrences of a pattern in a string, use sub() or gsub() from base R, or str_replace() or str_replace_all() from the stringr package. sub() and str_replace() replace only the first match; gsub() and str_replace_all() replace every match. Any of them can also be called inside dplyr's mutate() if you work with that package.
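As a small illustration of the difference between replacing the first match and every match, here is a base R sketch. The variable names are made up for this example and are not part of the question.

```r
# Hypothetical single genotype string, used only for illustration
genotype <- "0/0"

# sub() replaces only the first match of the pattern
first_only <- sub("0", "T", genotype)

# gsub() replaces every match of the pattern
all_matches <- gsub("0", "T", genotype)

print(first_only)   # "T/0"
print(all_matches)  # "T/T"
```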
To replace values in a column, use square-bracket indexing: df[...] can update values in a single column or across all columns. To refer to a single column, use df$column_name.
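A minimal sketch of both forms of bracket replacement, using a toy data frame invented for this example (the names `toy`, `g1` and `g2` are assumptions, not from the question):

```r
# Hypothetical all-character toy data frame
toy <- data.frame(g1 = c("a", "b", "a"),
                  g2 = c("b", "b", "a"),
                  stringsAsFactors = FALSE)

# Update matching values in a single column
toy$g1[toy$g1 == "a"] <- "A"

# Update matching values across all columns at once
toy[toy == "b"] <- "B"

print(toy)
```

Note that this kind of replacement swaps whole cell values; it does not edit part of a string, which is what the question needs.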
Use str_replace_all() from the stringr package to replace several different strings at once in a single column: pass a named character vector whose names are the patterns and whose values are the replacements. This also works for updating only part of a string.
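A short sketch of the named-vector form, assuming the stringr package is installed. The vector `calls` is made up for illustration.

```r
library(stringr)

# Hypothetical genotype calls for one variant with Ref = "T" and Alt = "C"
calls <- c("0/1", "1/1", "0/0")

# Names are the patterns, values are the replacements
recoded <- str_replace_all(calls, c("0" = "T", "1" = "C"))

print(recoded)  # "T/C" "C/C" "T/T"
```

The same mapping is applied to every element here, so on its own this does not handle the question's case where the replacement letters differ from row to row.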
Creating data:
df <- read.table(text="chr pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191 T C 1/1 0/1 1/1 0/0 1/1 1/1
Chr1 1463534 G C 0/0 1/1 0/0 0/1 0/0 0/0
Chr1 1463881 T A 0/1 0/0 1/1 0/0 1/1 1/1
Chr1 1464091 G A 0/0 0/0 1/1 0/0 1/1 1/1
Chr1 1464651 T C 1/1 0/0 1/1 0/1 1/1 1/1",head=T, stringsAsFactors=F)
Using gsub:
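# Vectorize gsub so that pattern, replacement and x are paired element-wise
vgsub <- Vectorize(gsub, SIMPLIFY = FALSE)
# Transpose so each original row becomes a column, paired with its own Ref allele
new <- vgsub("0", df$Ref, as.data.frame(t(df[5:10])))
# Replace "1" with the Alt allele of the same row
new <- vgsub("1", df$Alt, new)
# Bind the per-row results back into the genotype columns
df[5:10] <- do.call("rbind", new)
df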
chr pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
1 Chr1 1462191 T C C/C T/C C/C T/T C/C C/C
2 Chr1 1463534 G C G/G C/C G/G G/C G/G G/G
3 Chr1 1463881 T A T/A T/T A/A T/T A/A A/A
4 Chr1 1464091 G A G/G G/G A/A G/G A/A A/A
5 Chr1 1464651 T C C/C T/T C/C T/C C/C C/C
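For comparison, here is an alternative base R sketch using a plain loop and chartr() instead of a vectorized gsub(). It is not the method from the answer above, and it assumes every Ref and Alt allele is a single character (biallelic SNPs), because chartr() translates one character to one character. The name `geno_cols` is introduced here for illustration.

```r
df <- read.table(text = "chr pos Ref Alt D045313 D045314 D045135 D045136 D045137 D045138
Chr1 1462191 T C 1/1 0/1 1/1 0/0 1/1 1/1
Chr1 1463534 G C 0/0 1/1 0/0 0/1 0/0 0/0
Chr1 1463881 T A 0/1 0/0 1/1 0/0 1/1 1/1
Chr1 1464091 G A 0/0 0/0 1/1 0/0 1/1 1/1
Chr1 1464651 T C 1/1 0/0 1/1 0/1 1/1 1/1", header = TRUE, stringsAsFactors = FALSE)

geno_cols <- 5:10  # positions of the genotype columns (assumed, as in df[5:10])

for (i in seq_len(nrow(df))) {
  # Map "0" -> Ref and "1" -> Alt for this row only
  alleles <- paste0(df$Ref[i], df$Alt[i])
  df[i, geno_cols] <- chartr("01", alleles, unlist(df[i, geno_cols]))
}

df
```

The loop is slower than a vectorized approach on large tables, but it keeps the per-row pairing of Ref and Alt explicit.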