
regex - return all before the second occurrence

Tags: regex, r

Given this string:

DNS000001320_309.0/121.0_t0

How would I return everything before the second occurrence of "_"?

DNS000001320_309.0/121.0

I am using R.

Thanks.

asked Sep 16 '11 19:09 by James


1 Answer

The following script:

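s <- "DNS000001320_309.0/121.0_t0"
# The pattern is anchored, so a single substitution with sub() is enough.
# The variable is named `result` to avoid masking base R's t() function.
result <- sub("^([^_]*_[^_]*)_.*$", "\\1", s)
result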

will print:

DNS000001320_309.0/121.0

A quick explanation of the regex:

^         # the start of the input
(         # start group 1
  [^_]*   #   zero or more chars other than `_`
  _       #   a literal `_`
  [^_]*   #   zero or more chars other than `_`
)         # end group 1
_         # a literal `_`
.*        # consume the rest of the string
$         # the end of the input

which is replaced with:

\\1       # whatever is matched in group 1

If the string contains fewer than two underscores, the pattern does not match and the string is returned unchanged.
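
As a quick check of the no-match case, here is a small sketch that runs the same pattern over a character vector. The extra input strings ("a_b_c_d", "only_one", "none") are made up for illustration and are not from the question; `sub()` is used instead of `gsub()` on the assumption that one substitution per string is all that is needed, since the pattern is anchored at both ends.

```r
# Test inputs: the string from the question plus three invented ones
x <- c("DNS000001320_309.0/121.0_t0",  # two underscores
       "a_b_c_d",                      # more than two underscores
       "only_one",                     # a single underscore
       "none")                         # no underscore at all

# sub() is vectorised, so every element is processed in one call
result <- sub("^([^_]*_[^_]*)_.*$", "\\1", x)
result
# [1] "DNS000001320_309.0/121.0" "a_b"  "only_one"  "none"
```

The last two elements come back untouched because the pattern requires two underscores before it can match.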

answered Sep 30 '22 21:09 by Bart Kiers