I'm trying to use the stringr package in R to extract everything from a string up until the first occurrence of an underscore.
What I've tried
str_extract("L0_123_abc", ".+?(?<=_)")
> "L0_"
Close but no cigar. How do I get this one? Also, Ideally I'd like something that's easy to extend so that I can get the information in between the 1st and 2nd underscore and get the information after the 2nd underscore.
To get L0
, you may use
> library(stringr)
> str_extract("L0_123_abc", "[^_]+")
[1] "L0"
The [^_]+
matches 1 or more chars other than _
.
Also, you may split the string with _
:
x <- str_split("L0_123_abc", fixed("_"))
> x
[[1]]
[1] "L0" "123" "abc"
This way, you will have all the substrings you need.
The same can be achieved with
> str_extract_all("L0_123_abc", "[^_]+")
[[1]]
[1] "L0" "123" "abc"
The regex lookaround should be
str_extract("L0_123_abc", ".+?(?=_)")
#[1] "L0"
Using gsub...
gsub("(.+?)(\\_.*)", "\\1", "L0_123_abc")
You can use sub
from base using _.*
taking everything starting from _
.
sub("_.*", "", "L0_123_abc")
#[1] "L0"
Or using [^_]
what is everything but not _
.
sub("([^_]*).*", "\\1", "L0_123_abc")
#[1] "L0"
or using substr
with regexpr
.
substr("L0_123_abc", 1, regexpr("_", "L0_123_abc")-1)
#substr("L0_123_abc", 1, regexpr("_", "L0_123_abc", fixed=TRUE)-1) #More performant alternative
#[1] "L0"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With