I have a tab-delimited file with 400 columns, and I want to append text to the column names. For example, if there are columns named A and B, I want to change A to A.ovca and B to B.ctrls. Likewise, I want to add one of the two suffixes (ovca or ctrls) to each of the 400 columns, so some column names get ovca and some get ctrls. All the column names are unique, and the file contains more than 1000 rows. A sample of the file is given below:
X Y Z A B C
2.34 .89 1.4 .92 9.40 .82
6.45 .04 2.55 .14 1.55 .04
1.09 .91 4.19 .16 3.19 .56
5.87 .70 3.47 .80 2.47 .90
And I want the file to look like this:
X.ovca Y.ctrls Z.ctrls A.ovca B.ctrls C.ovca
2.34 .89 1.4 .92 9.40 .82
6.45 .04 2.55 .14 1.55 .04
1.09 .91 4.19 .16 3.19 .56
5.87 .70 3.47 .80 2.47 .90
Please do help me
Regards Thileepan
If your data.frame is called dat, you can access (and write to) the column names with colnames(dat).
Therefore:
cn <- colnames(dat)
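# ^ and $ anchor the match to the whole name, so longer names that merely contain one of these letters are left alone
cn <- sub("^([AXC])$","\\1.ovca",cn)
cn <- sub("^([YZB])$","\\1.ctrls",cn)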
colnames(dat) <- cn
> cn
[1] "X.ovca" "Y.ctrls" "Z.ctrls" "A.ovca" "B.ctrls" "C.ovca"
The \\1 is a backreference in the replacement string: it is replaced with whatever the parentheses in the pattern matched. Since the parentheses contain a bracket expression, the pattern matches any one of the letters inside the brackets. In this case, "A" becomes "A.ovca" and "X" becomes "X.ovca".
If your variable names are longer than one letter, this is easy enough to extend; just read up a bit on regular expressions.
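As a rough sketch of that extension, assuming some made-up multi-letter names (sampleA, ctl1 and so on are hypothetical, not from the question), you can list the whole names as alternatives separated by | and anchor the pattern with ^ and $ so that only complete names match:

```r
# Hypothetical multi-letter column names, for illustration only
cn <- c("sampleA", "sampleB", "ctl1", "ctl2")

# Names that should receive each suffix (assumed groupings)
ovca_names  <- c("sampleA", "sampleB")
ctrls_names <- c("ctl1", "ctl2")

# Build patterns such as "^(sampleA|sampleB)$"; ^ and $ force a whole-name match
ovca_pat  <- paste0("^(", paste(ovca_names,  collapse = "|"), ")$")
ctrls_pat <- paste0("^(", paste(ctrls_names, collapse = "|"), ")$")

cn <- sub(ovca_pat,  "\\1.ovca",  cn)
cn <- sub(ctrls_pat, "\\1.ctrls", cn)
cn
# [1] "sampleA.ovca" "sampleB.ovca" "ctl1.ctrls"   "ctl2.ctrls"
```

Note that names containing regex metacharacters (a literal dot, for instance) would need escaping before being pasted into a pattern like this.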
Here is a two-liner using the stringr package.
library(stringr)
nam <- names(mydf)
names(mydf) <- ifelse(nam %in% c('X', 'A', 'C'),
                      str_c(nam, '.ovca'), str_c(nam, '.ctrls'))
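The same idea should also work without stringr. A minimal base R sketch, assuming a small toy data frame (the values and the ovca_cols vector are made up for illustration):

```r
# Toy data frame standing in for the real 400-column table
mydf <- data.frame(X = 1:2, Y = 3:4, Z = 5:6, A = 7:8, B = 9:10, C = 11:12)

# Columns that should get ".ovca"; every other column gets ".ctrls"
ovca_cols <- c("X", "A", "C")

nam <- names(mydf)
# ifelse() picks the suffix for each name, paste0() glues it on
names(mydf) <- paste0(nam, ifelse(nam %in% ovca_cols, ".ovca", ".ctrls"))
names(mydf)
# [1] "X.ovca"  "Y.ctrls" "Z.ctrls" "A.ovca"  "B.ctrls" "C.ovca"
```

With 400 columns, only the vector of ovca names has to be spelled out; anything not listed falls into the ctrls group.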
How about this? You basically find the columns to which you want to append "ovca" or "ctrls" using %in%, and then append the appropriate tag.
> (mydf <- data.frame(X = runif(10), Y = runif(10), Z = runif(10), A = runif(10), B = runif(10), C = runif(10)))
X Y Z A B C
1 0.81030594 0.1624974 0.3977381 0.9619541 0.9866498 0.4424760
2 0.92498687 0.2069429 0.6065115 0.9969835 0.2407364 0.2455184
3 0.11033869 0.2878640 0.5662793 0.7936232 0.6066735 0.8210634
> names(mydf)[names(mydf) %in% c("X", "A", "C")] <- paste(names(mydf)[names(mydf) %in% c("X", "A", "C")], "ovca", sep = ".")
> names(mydf)[names(mydf) %in% c("Y", "Z", "B")] <- paste(names(mydf)[names(mydf) %in% c("Y", "Z", "B")], "ctrls", sep = ".")
> mydf
X.ovca Y.ctrls Z.ctrls A.ovca B.ctrls C.ovca
1 0.81030594 0.1624974 0.3977381 0.9619541 0.9866498 0.4424760
2 0.92498687 0.2069429 0.6065115 0.9969835 0.2407364 0.2455184
3 0.11033869 0.2878640 0.5662793 0.7936232 0.6066735 0.8210634
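Since the question starts from a tab-delimited file, here is a hedged end-to-end sketch: read the file, rename the columns from a lookup vector, and write it back out. The temporary file, the suffix vector and the output path are all assumptions made for the example; in practice the lookup would be built from whatever list says which sample is ovca and which is ctrls.

```r
# Write a small tab-delimited file to a temporary path (stand-in for the real file)
infile <- tempfile(fileext = ".txt")
writeLines(c("X\tY\tZ\tA\tB\tC",
             "2.34\t.89\t1.4\t.92\t9.40\t.82",
             "6.45\t.04\t2.55\t.14\t1.55\t.04"), infile)

# check.names = FALSE keeps the header exactly as it appears in the file
dat <- read.delim(infile, check.names = FALSE)

# Named vector mapping each column name to its suffix (assumed groupings)
suffix <- c(X = "ovca", Y = "ctrls", Z = "ctrls",
            A = "ovca", B = "ctrls", C = "ovca")

# Fail early if a column has no entry in the lookup
stopifnot(all(names(dat) %in% names(suffix)))

# Look up the suffix by column name and join with a dot
names(dat) <- paste(names(dat), suffix[names(dat)], sep = ".")

# Write the renamed table back out as tab-delimited text
outfile <- tempfile(fileext = ".txt")
write.table(dat, outfile, sep = "\t", quote = FALSE, row.names = FALSE)
readLines(outfile, n = 1)
```

Indexing the lookup by name means the order of the columns in the file does not matter, which is convenient when there are hundreds of them.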