Remove all text before colon

Tags: replace, unix, r, sed, awk

I have a file containing a certain number of lines. Each line looks like this:

TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1 

I would like to remove everything before the ":" character so that only PKMYT1, which is a gene name, remains. Since I'm not an expert in regular expressions, can anyone help me do this with Unix tools (sed or awk) or in R?

Elb asked Sep 06 '12 10:09




2 Answers

Here are two ways of doing it in R:

    foo <- "TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1"

    # Remove everything up to and including ":":
    gsub(".*:", "", foo)

    # Extract everything after ":" (returns a list):
    regmatches(foo, gregexpr("(?<=:).*", foo, perl=TRUE))

Sacha Epskamp answered Sep 21 '22 17:09


A simple regular expression used with gsub():

    x <- "TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1"
    gsub(".*:", "", x)
    # [1] "PKMYT1"

See ?regex or ?gsub for more help.

Andrie answered Sep 18 '22 17:09
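
The question also asks about sed and awk, which neither answer above covers. The following is a minimal sketch, not taken from either answer, of the same ".*:" idea applied to a whole file from the shell. The temporary input file and the variable name `input` are assumptions made for illustration; substitute the path of the real file.

```shell
# Hypothetical sample input: a temporary file holding one line in the
# format shown in the question.
input=$(mktemp)
printf '%s\n' 'TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1' > "$input"

# sed: delete everything up to and including the last ":" on each line.
sed 's/.*://' "$input"

# awk: split each line on ":" and print the last field.
awk -F: '{ print $NF }' "$input"
```

Both commands print the part of each line after the last colon. If a line could contain several colons and everything after the first one is wanted instead, `cut -d: -f2-` is one alternative.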