I have many filenames that look like this:
txt <- "MA0051_IRF2.xml"
I want to extract IRF2, the part between "_" and ".". How do I do this in R?
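To achieve this, you need a regexp that matches the whole string piece by piece:
.*        matches everything up to the underscore
[_]       matches the underscore itself
([^.]+)   captures one or more characters that are not a dot
[.]       matches the literal dot
.*        matches the rest of the string
In your call to gsub, you then replace the whole match with the captured group:
\\1
(we need to escape the backslash, hence the double backslash)
Example: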
gsub(".*[_]([^.]+)[.].*", "\\1", "MA0051_IRF2.xml")
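Since the question mentions many filenames, here is a minimal sketch of the same pattern applied to a whole character vector. The vector name `files` and its second and third entries are made up for illustration; only the first filename comes from the question. `sub()` is used instead of `gsub()` because the pattern already spans the entire string, so there is at most one match per element.

```r
# Hypothetical file names; only the first one appears in the question.
files <- c("MA0051_IRF2.xml", "MA0052_MEF2A.xml", "MA0139_CTCF.xml")

# Same pattern as in the example above, applied element-wise.
# The whole string is replaced by the captured group between "_" and ".".
ids <- sub(".*[_]([^.]+)[.].*", "\\1", files)

print(ids)
```

Note that `sub()` and `gsub()` return a string unchanged when the pattern does not match, so a filename with no underscore or no dot would come back as-is rather than as a missing value.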
Another possibility, with the stringr package:
library(stringr)
# Lookarounds work directly in current stringr; the old perl() wrapper is deprecated.
str_extract(txt, "(?<=_)(.+)(?=\\.)")