
remove everything before first occurrence of a pattern regex gsub R

Tags:

regex

r

This might be super easy but I still cannot find the answer. I would like to remove everything before the first "que" in my string:

What I am doing:

v <- c("blabla que 1", "blabla que eu Boqueirão que ")
gsub(".*que", "", v)
# [1] " 1" " "

What I want is "1" and "eu Boqueirão que ". When I try .*^que, it does not have any effect. Thank you for your help.

jvqp asked Jan 19 '26 22:01


1 Answer

To remove everything up to and including the first occurrence of a pattern, use sub() with a lazy quantifier:

sub(".*?que", "", v)
## => [1] " 1"                 " eu Boqueirão que "
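
To see why the lazy quantifier makes the difference, it can help to look at what each pattern actually matches. The following is a minimal sketch using regmatches() with regexpr() on the same vector; the names greedy and lazy are only illustrative, and \u00e3 is the escaped form of ã, used here just to keep the snippet independent of the file encoding.

```r
v <- c("blabla que 1", "blabla que eu Boqueir\u00e3o que ")

# Greedy: .* grabs as much as possible, so the match ends at the LAST "que".
greedy <- regmatches(v, regexpr(".*que", v))

# Lazy: .*? grabs as little as possible, so the match ends at the FIRST "que".
lazy <- regmatches(v, regexpr(".*?que", v))

print(greedy)  # second element runs through the final "que"
print(lazy)    # both elements are "blabla que"
```

Whatever the pattern matches is what sub() deletes, so the greedy version removes almost the whole second string.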

If you also need to remove any whitespace (zero or more characters) that follows the match, use

sub(".*?que\\s*", "", v, perl=TRUE)
## => [1] "1"                 "eu Boqueirão que "

Note that perl=TRUE is important here. With the default TRE engine, the first lazy quantifier (*?) in .*?que\s* switches off greediness for the other quantifiers on the same level, so \s* also behaves lazily. A lazy pattern at the end of a regex matches as little as possible, which is the empty string, so the whitespace would be left in place.
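
If you would rather not depend on the regex engine at all, one possible alternative (a sketch, not part of the original answer) is to strip the prefix first and then trim the leading whitespace with base R's trimws(). The names res_pcre and res_trim are illustrative, and \u00e3 is the escaped form of ã.

```r
v <- c("blabla que 1", "blabla que eu Boqueir\u00e3o que ")

# PCRE: the lazy .*? stops at the first "que", then \s* greedily eats the spaces.
res_pcre <- sub(".*?que\\s*", "", v, perl = TRUE)

# Alternative: remove the prefix with the default engine,
# then trim leading whitespace in a separate step.
res_trim <- trimws(sub(".*?que", "", v), which = "left")

print(res_pcre)
print(identical(res_pcre, res_trim))
```

Both approaches keep the trailing space of the second string, because only leading whitespace is removed.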

BONUS: To remove all text up to the first occurrence of a pattern but keep the pattern itself, wrap the part you need to keep in capturing parentheses and use \1 in the replacement:

sub(".*?(que)", "\\1", v)
## => [1] "que 1"                 "que eu Boqueirão que "
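
When the text to search for is a literal string that may contain regex metacharacters (for example "." or "("), a regex-free variant can be sketched with regexpr(fixed = TRUE) and substring(). The helper name remove_up_to_first and its keep_pattern argument are hypothetical, introduced only for this illustration; strings without a match are returned unchanged, as with sub().

```r
# Hypothetical helper: drop everything up to the first literal occurrence of
# `pattern`; with keep_pattern = TRUE the occurrence itself is kept.
remove_up_to_first <- function(x, pattern, keep_pattern = FALSE) {
  pos <- regexpr(pattern, x, fixed = TRUE)  # position of first match, -1 if none
  len <- attr(pos, "match.length")          # length of each match
  pos <- as.integer(pos)                    # drop the regexpr attributes
  start <- if (keep_pattern) pos else pos + len
  start[pos == -1L] <- 1L                   # no match: keep the whole string
  substring(x, start)
}

v <- c("blabla que 1", "blabla que eu Boqueir\u00e3o que ")
print(remove_up_to_first(v, "que"))
print(remove_up_to_first(v, "que", keep_pattern = TRUE))
print(remove_up_to_first("a.b.c", "."))  # "." is treated literally here
```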

Wiktor Stribiżew answered Jan 22 '26 12:01


