I have a data frame; the first few rows are shown below:
                   SM_H1455 SM_V1456 SM_K1457 SM_X1461 SM_K1462
ENSG00000000419.8       290      270      314      364      240
ENSG00000000457.8       252      230      242      220      106
ENSG00000000460.11      154      158      162      136       64
ENSG00000000938.7     20106    18664    19764    15640    19024
ENSG00000000971.11       30       10        4        2       10
Note that there are many more columns and rows.
Here's what I want to do: rename the columns. The most important information in a column name, e.g. SM_H1455, is its 4th character, in this case an H. I want to replace the "SM" part with "Control" if the 4th character is "H" or "K", and with "Case" if the 4th character is "X" or "V", keeping everything else in the name. In the end, I'd like a table like this:
                   Control_H1455 Case_V1456 Control_K1457 Case_X1461 Control_K1462
ENSG00000000419.8            290        270           314        364           240
ENSG00000000457.8            252        230           242        220           106
ENSG00000000460.11           154        158           162        136            64
ENSG00000000938.7          20106      18664         19764      15640         19024
ENSG00000000971.11            30         10             4          2            10
Please keep in mind that whether the 4th character is "V", "X", "K" or "H" is completely random.
I'd appreciate any help! Thanks.
One way, where x is your data frame:
controls <- which(substring(names(x),4,4) %in% c("H","K"))
cases <- which(substring(names(x),4,4) %in% c("X","V"))
names(x)[controls] <- gsub("SM","Control",names(x)[controls])
names(x)[cases] <- gsub("SM","Case",names(x)[cases])
Alternatively:
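names(x) <- sapply(names(x), function(z) {
  key <- substring(z, 4, 4)  # the 4th character decides the group
  if (key %in% c("H", "K"))
    sub("SM", "Control", z)
  else if (key %in% c("X", "V"))
    sub("SM", "Case", z)
  else
    z  # leave any other column name unchanged instead of returning NULL
}, USE.NAMES = FALSE)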
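A third option, as a minimal sketch: since sub() is vectorised over its input, the renaming can probably be done with two regular-expression substitutions and no indexing or looping. The toy data frame below is only an assumption that mirrors the first two rows of the question; the patterns assume every column to be renamed starts with "SM_" followed by the group letter.

```r
# Toy data frame mirroring the first rows of the question (illustrative only)
x <- data.frame(
  SM_H1455 = c(290, 252),
  SM_V1456 = c(270, 230),
  SM_K1457 = c(314, 242),
  SM_X1461 = c(364, 220),
  SM_K1462 = c(240, 106),
  row.names = c("ENSG00000000419.8", "ENSG00000000457.8")
)

# Replace a leading "SM" only when it is followed by "_H" or "_K" (controls);
# \\1 puts the captured "_H" / "_K" back after the new prefix
names(x) <- sub("^SM(_[HK])", "Control\\1", names(x))

# Same idea for cases ("_X" or "_V"); names matching neither pattern are untouched
names(x) <- sub("^SM(_[XV])", "Case\\1", names(x))

names(x)
```

Anchoring the pattern with ^ means only the prefix is replaced, and columns that do not follow the SM_<letter> scheme keep their original names.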