
Appending or pasting text to column names in R

I have a tab-delimited file with 400 columns, and I want to append text to the column names. For example, if there are columns named A and B, I want to change A to A.ovca and B to B.ctrls. Likewise, I want to add one of the two suffixes (ovca or ctrls) to all 400 columns: some column names get ovca and some get ctrls. All the columns are unique, and the file contains more than 1000 rows. A sample of the file is given below:

         X             Y         Z               A       B               C  
        2.34          .89       1.4             .92     9.40            .82
        6.45          .04       2.55            .14     1.55            .04
        1.09          .91       4.19            .16     3.19            .56
        5.87          .70       3.47            .80     2.47            .90

And I want the file to look like this:

       X.ovca     Y.ctrls      Z.ctrls       A.ovca     B.ctrls       C.ovca  
        2.34          .89       1.4             .92     9.40            .82
        6.45          .04       2.55            .14     1.55            .04
        1.09          .91       4.19            .16     3.19            .56
        5.87          .70       3.47            .80     2.47            .90

Please do help me

Regards Thileepan

Dinesh asked Nov 06 '11 16:11


3 Answers

If your data.frame is called dat, you can read (and assign to) the column names with colnames(dat).

Therefore:

cn <- colnames(dat)
cn <- sub("([AXC])","\\1.ovca",cn)
cn <- sub("([YZB])","\\1.ctrls",cn)
colnames(dat) <- cn

> cn
[1] "X.ovca"  "Y.ctrls" "Z.ctrls" "A.ovca"  "B.ctrls" "C.ovca" 

The \\1 is a backreference to the first parenthesised group in the pattern: in the replacement, it stands for whatever that group matched. Because the group contains a bracketed character class, it matches any one of the letters listed. In this case, "A" becomes "A.ovca" and "X" becomes "X.ovca".

If your variable names are longer than one letter, this is easy enough to extend; just read up a bit on regular expressions.
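
As a rough sketch of that extension, assuming hypothetical multi-letter column names such as "tumor1" and "normal1" (these are not from the question), anchoring the pattern with ^ and $ keeps it from matching in the middle of a longer name:

```r
# Hypothetical multi-letter column names, for illustration only
cn <- c("tumor1", "tumor2", "normal1", "normal2")

# ^ and $ tie the match to the whole name; \\1 re-inserts the matched name
cn <- sub("^(tumor[0-9]+)$", "\\1.ovca", cn)
cn <- sub("^(normal[0-9]+)$", "\\1.ctrls", cn)
cn
```

The same two sub() calls would then be assigned back with colnames(dat) <- cn, as above.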

Ari B. Friedman answered Nov 05 '22 21:11


Here is a two-liner using the stringr package.

library(stringr)  # provides str_c()
nam <- names(mydf)
# X, A and C get ".ovca"; every other column gets ".ctrls"
names(mydf) <- ifelse(nam %in% c('X', 'A', 'C'), 
   str_c(nam, '.ovca'),  str_c(nam, '.ctrls'))
Ramnath answered Nov 05 '22 23:11


How about this? You find the columns that should get "ovca" and the ones that should get "ctrls" using %in%, and append the appropriate tag with paste().

> (mydf <- data.frame(X = runif(10), Y = runif(10), Z = runif(10), A = runif(10), B = runif(10), C = runif(10)))
            X         Y         Z         A         B         C
1  0.81030594 0.1624974 0.3977381 0.9619541 0.9866498 0.4424760
2  0.92498687 0.2069429 0.6065115 0.9969835 0.2407364 0.2455184
3  0.11033869 0.2878640 0.5662793 0.7936232 0.6066735 0.8210634

> names(mydf)[names(mydf) %in% c("X", "A", "C")] <- paste(names(mydf)[names(mydf) %in% c("X", "A", "C")], "ovca", sep = ".")
> names(mydf)[names(mydf) %in% c("Y", "Z", "B")] <- paste(names(mydf)[names(mydf) %in% c("Y", "Z", "B")], "ctrls", sep = ".")
> mydf
       X.ovca   Y.ctrls   Z.ctrls    A.ovca   B.ctrls    C.ovca
1  0.81030594 0.1624974 0.3977381 0.9619541 0.9866498 0.4424760
2  0.92498687 0.2069429 0.6065115 0.9969835 0.2407364 0.2455184
3  0.11033869 0.2878640 0.5662793 0.7936232 0.6066735 0.8210634
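
With 400 columns, typing every name into %in% twice gets tedious. A minimal sketch of one way around that, assuming the ovca column names are available as a character vector (the hypothetical ovca_cols below) and that every other column is a control:

```r
# Small stand-in for the real 400-column table
mydf <- data.frame(X = 1:2, Y = 3:4, Z = 5:6, A = 7:8, B = 9:10, C = 11:12)

# Hypothetical vector of ovca columns; in practice it might come from a sample sheet
ovca_cols <- c("X", "A", "C")

# Choose the suffix per column, then paste it on in one vectorised step
suffix <- ifelse(names(mydf) %in% ovca_cols, "ovca", "ctrls")
names(mydf) <- paste(names(mydf), suffix, sep = ".")
names(mydf)
```

The renamed table could then be written back as tab-delimited text with something like write.table(mydf, "out.txt", sep = "\t", quote = FALSE, row.names = FALSE), where "out.txt" is a placeholder file name.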
Roman Luštrik answered Nov 05 '22 23:11