
Grab from beginning to first occurrence of character with gsub

Tags: regex, r

I have the following string, and I'd like to grab everything from the beginning of the sentence up to the first ##. I could use strsplit, as I demonstrate below, but I'd prefer a gsub solution. If gsub is not the correct tool (I think it is, though), I'd still prefer a base solution because I want to learn the base regex tools.

x <- "gfd gdr tsvfvetrv erv tevgergre ## vev fe ## vgrrgf"

strsplit(x, "##")[[c(1, 1)]]  #works

gsub("(.*)(##.*)", "\\1", x)  #I want to work
Tyler Rinker asked Nov 28 '12 15:11


1 Answer

Just add one character, putting a ? after the first quantifier to make it "non-greedy":

gsub("(.*?)(##.*)", "\\1", x) 
# [1] "gfd gdr tsvfvetrv erv tevgergre "

Here's the relevant documentation, from ?regex:

By default repetition is greedy, so the maximal possible number of repeats is used. This can be changed to 'minimal' by appending '?' to the quantifier.
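
To see what that means for the string in the question, here is a small side-by-side sketch of the greedy and non-greedy patterns. The variable names (greedy, lazy, trimmed) are just illustrative, and the last line is an alternative I'd assume works equally well here, not part of the original answer: since sub() only replaces the first match, deleting from the first ## onward needs no capture group at all.

```r
x <- "gfd gdr tsvfvetrv erv tevgergre ## vev fe ## vgrrgf"

# Greedy: .* runs as far as it can, so group 1 extends to the LAST "##"
# and only the final segment is dropped.
greedy <- gsub("(.*)(##.*)", "\\1", x)

# Non-greedy: .*? takes as little as it can, so group 1 stops at the FIRST "##".
lazy <- gsub("(.*?)(##.*)", "\\1", x)

# Alternative sketch: remove everything from the first "##" to the end.
trimmed <- sub("##.*", "", x)

print(greedy)
print(lazy)
print(identical(lazy, trimmed))
```

Note that the result keeps the trailing space that sat before the first ##, just like the strsplit version does.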

Josh O'Brien answered Sep 30 '22 18:09