
Remove characters after the last occurrence of a specific character

Tags:

regex

r

I have a character vector that looks like:

exampleList <- c("rs40535:1745233:G:A_AGGG","rs41111:1733320:GAC:AAC_TTTTTTG", "exm2344379:1724237:A:T_A", "exm-rs234380:1890910:A:G_A", "rs423444419_T","psy_rs73453432_TCCC","22:1701234072:C:T_C","9:4534345:rs2342342_G","chr10_rs7287862_C","psy_rs7291672_A")  

I wish to remove everything after the last underscore ( _ ) so my result looks something like this:

[1] "rs40535:1745233:G:A"      "rs41111:1733320:GAC:AAC"  "exm2344379:1724237:A:T"   "exm-rs234380:1890910:A:G"   "rs423444419"              "psy_rs73453432"           "22:1701234072:C:T"        "9:4534345:rs2342342"     "chr10_rs7287862"          "psy_rs7291672"    

I've tried the following, but this removes everything after the first _.

gsub("\\_.*$","",exampleList) 
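
To make the problem concrete, here is a minimal illustration on a single element that contains two underscores. The variable names `x` and `first_cut` are only for this sketch; the pattern is the one from my attempt above.

```r
x <- "psy_rs73453432_TCCC"

# "_.*$" starts matching at the FIRST underscore and ".*" then runs
# to the end of the string, so everything after "psy" is dropped.
first_cut <- gsub("\\_.*$", "", x)

print(first_cut)  # "psy", but the desired result is "psy_rs73453432"
```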

I recognize there are similar posts, but I could not find one for R.

Sheila asked Jun 21 '17 22:06



1 Answer

Figured it out!

outcome <- sub("_[^_]+$", "", exampleList)
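
A rough sketch of why this works, checked on three elements from the question. The pattern `[^_]+$` can only match a run of non-underscore characters at the very end of the string, so the `_` in front of it has to be the last one. The second pattern, `^(.*)_.*$`, is an alternative added here for comparison and is not part of the original answer; the names `x`, `res_last` and `res_greedy` are illustrative.

```r
x <- c("rs40535:1745233:G:A_AGGG", "psy_rs73453432_TCCC", "chr10_rs7287862_C")

# "_" followed by one or more non-underscore characters up to the end:
# only the final underscore can satisfy this.
res_last <- sub("_[^_]+$", "", x)

# Alternative: the greedy "(.*)" consumes as much as possible, so the "_"
# after it is the last underscore; keep only the captured group.
res_greedy <- sub("^(.*)_.*$", "\\1", x)

print(res_last)
print(identical(res_last, res_greedy))
```

The two patterns agree on this data but can differ on edge cases: for a string that ends in an underscore, such as "abc_", `[^_]+` needs at least one character after the underscore and leaves the string unchanged, whereas the greedy version strips the trailing underscore.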
Sheila answered Sep 19 '22 17:09