s <- "1-343-43Hello_2_323.14_fdh-99H"
In R, I want to use a regex to get the substring before the nth (say, 2nd) underscore. How can this be done with a single regex? The alternative would be to split on '_' and then paste the first two pieces back together, something along these lines:
paste(sapply(strsplit(s, "_"),"[", 1:2), collapse = "_")
Gives:
[1] "1-343-43Hello_2"
But how can I write a regex to do the same?
In general, to answer the question in the title, use
sub("^(([^_]*_){n}[^_]*).*", "\\1", s)
where n is the number of _ you allow in the result. Replace n with a literal number: n = 1 gives the substring before the 2nd underscore.
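Since n has to appear as a literal number inside the pattern, one way to make it a parameter is to build the pattern with sprintf. A minimal sketch, assuming a helper name before_nth_underscore that is made up here (it is not part of the answer), where k is the ordinal of the underscore to cut at, so n = k - 1:

```r
# Hypothetical helper: returns the part of x before the k-th underscore.
# It fills n = k - 1 into the general pattern from the answer.
before_nth_underscore <- function(x, k) {
  stopifnot(k >= 1)
  pattern <- sprintf("^(([^_]*_){%d}[^_]*).*", as.integer(k) - 1L)
  sub(pattern, "\\1", x)
}

s <- "1-343-43Hello_2_323.14_fdh-99H"
before_nth_underscore(s, 2)
## => [1] "1-343-43Hello_2"
before_nth_underscore(s, 3)
## => [1] "1-343-43Hello_2_323.14"
```

If the string has fewer than k - 1 underscores, the pattern does not match and sub returns the input unchanged.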
You can use sub:
sub("^([^_]*_[^_]*).*", "\\1", s)
R code demo:
s <- "1-343-43Hello_2_323.14_fdh-99H"
sub("^([^_]*_[^_]*).*", "\\1", s)
## => [1] "1-343-43Hello_2"
Pattern details:
^ - start of string
([^_]*_[^_]*) - Group 1, capturing 0+ characters other than _, then a _, and again 0+ non-_ characters
.* - the rest of the string (note that in the TRE regex flavor, . matches newlines, too)
The \\1 replacement returns only the value captured in Group 1.
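A quick way to sanity-check the pattern is to compare it with the split-and-paste approach from the question and to try inputs with fewer underscores. This is only a sketch; the test strings other than s are made up for illustration:

```r
s <- "1-343-43Hello_2_323.14_fdh-99H"
pat <- "^([^_]*_[^_]*).*"

# Same result as the split-and-paste approach from the question
via_regex <- sub(pat, "\\1", s)
via_split <- paste(sapply(strsplit(s, "_"), "[", 1:2), collapse = "_")
identical(via_regex, via_split)
## => [1] TRUE

# With exactly one underscore the whole string lands in Group 1 and comes back as is;
# with no underscore the pattern does not match and sub() returns the input unchanged
sub(pat, "\\1", c("a_b", "abc"))
## => [1] "a_b" "abc"
```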