Here is a small example:
X1 <- c("AC", "AC", "AC", "CA", "TA", "AT", "CC", "CC")
X2 <- c("AC", "AC", "AC", "CA", "AT", "CA", "AC", "TC")
X3 <- c("AC", "AC", "AC", "AC", "AA", "AT", "CC", "CA")
mydf1 <- data.frame(X1, X2, X3)
Input data frame
X1 X2 X3
1 AC AC AC
2 AC AC AC
3 AC AC AC
4 CA CA AC
5 TA AT AA
6 AT CA AT
7 CC AC CC
8 CC TC CA
The function
# Function
atgc <- function(x) {
  xlate <- c("AA" = 11, "AC" = 12, "AG" = 13, "AT" = 14,
             "CA" = 12, "CC" = 22, "CG" = 23, "CT" = 24,
             "GA" = 13, "GC" = 23, "GG" = 33, "GT" = 34,
             "TA" = 14, "TC" = 24, "TG" = 34, "TT" = 44,
             "ID" = 56, "DI" = 56, "DD" = 55, "II" = 66)
  x = xlate[x]
}
outdataframe <- sapply(mydf1, atgc)
outdataframe
X1 X2 X3
AA 11 11 12
AA 11 11 12
AA 11 11 12
AG 13 13 12
CA 12 12 11
AC 12 13 13
AT 14 11 12
AT 14 14 14
The problem: AC should map to 12, but the output shows 11, and the other values are wrong in the same way. It's just a mess!
(Extra: I also do not know how to get rid of the row names.)
Just use `apply` and transpose:
t(apply (mydf1, 1, atgc))
To use `sapply` instead, either set `stringsAsFactors = FALSE` when creating your data frame, i.e.
mydf1 <- data.frame(X1, X2, X3, stringsAsFactors=FALSE)
(thanks @joran), or change the last line of your function to:
x = xlate[as.vector(x)]
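Both fixes work for the same reason: when a column is a factor, `xlate[x]` indexes by the factor's integer codes rather than by its labels. A minimal sketch of that effect, assuming a shortened lookup table (`xlate_small`, `f`, `by_code` and `by_name` are illustrative names, not from the question):

```r
# Shortened lookup table, for illustration only
xlate_small <- c("AA" = 11, "AC" = 12, "AG" = 13, "AT" = 14)

# Build the factor explicitly so the example behaves the same in any R version
f <- factor(c("AC", "AT", "AC"))

# Indexing with a factor uses its integer codes (AC = 1, AT = 2),
# so this picks the 1st and 2nd elements of the table, not "AC" and "AT"
by_code <- xlate_small[f]

# Converting to character first looks the values up by name
by_name <- xlate_small[as.character(f)]

print(by_code)
print(by_name)
```

Note that since R 4.0.0, `data.frame()` defaults to `stringsAsFactors = FALSE`, so the scrambled output in the question should only appear on older versions or when the columns are explicitly turned into factors.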
The `match` function accepts a factor as its first argument and matches its labels against a target vector of class "character":
atgc <- function(fac) {
  c(11, 12, 13, 14,
    12, 22, 23, 24,
    13, 23, 33, 34,
    14, 24, 34, 44,
    56, 56, 55, 66)[
    match(fac,
          c("AA", "AC", "AG", "AT",
            "CA", "CC", "CG", "CT",
            "GA", "GC", "GG", "GT",
            "TA", "TC", "TG", "TT",
            "ID", "DI", "DD", "II"))
  ]
}
#The match function returns an index that is designed to pull from a vector.
sapply(mydf1, atgc)
X1 X2 X3
[1,] 12 12 12
[2,] 12 12 12
[3,] 12 12 12
[4,] 12 12 12
[5,] 14 14 11
[6,] 14 12 14
[7,] 22 12 22
[8,] 22 24 12
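On the side question about row names: with a named lookup vector, the names on the result come from the lookup table itself, so one option is to drop them with `unname()`. A rough sketch, assuming the result should stay a data frame; `xlate_small`, `recode_atgc` and `recoded` are hypothetical names, and the table is trimmed to the codes that occur in the example data:

```r
# Lookup table trimmed to the codes that occur in the example data
xlate_small <- c("AA" = 11, "AC" = 12, "AT" = 14, "CA" = 12,
                 "CC" = 22, "TA" = 14, "TC" = 24)

# Hypothetical helper: look up by label, then drop the element names
recode_atgc <- function(x) unname(xlate_small[as.character(x)])

X1 <- c("AC", "AC", "AC", "CA", "TA", "AT", "CC", "CC")
X2 <- c("AC", "AC", "AC", "CA", "AT", "CA", "AC", "TC")
X3 <- c("AC", "AC", "AC", "AC", "AA", "AT", "CC", "CA")

# Factors are forced here to mimic the situation in the question
mydf1 <- data.frame(X1, X2, X3, stringsAsFactors = TRUE)

# Assigning into recoded[] keeps the data frame structure and column names
recoded <- mydf1
recoded[] <- lapply(mydf1, recode_atgc)
print(recoded)
```

Because the lookup goes through `as.character()`, this version gives the same values whether the columns are factors or plain character vectors.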