I have a set of strings that are file names. I want to extract all characters after the # symbol but before the file extension. For example, one of the file names is:
HelloWorld#you.txt
I would want to return the string you
Here is my code:
hashPos = grep("#", name, fixed=TRUE)
dotPos = length(name)-3
finalText = substring(name, hashPos, dotPos)
I read online that grep is supposed to return the index where the first parameter occurs (in this case the # symbol), so I was expecting the above to work, but it does not.
Or how would I use a regular expression to extract this string? Also, what happens when the string does not have a # symbol? Would the function return a special value such as -1?
Here is a one-liner solution:
gsub(".*\\#(.*)\\..*", "\\1", c("HelloWorld#you.txt"))
Output:
you
To explain the code: it matches everything up to # and then captures all characters (not only word characters) up to the last . in the name. The whole match is replaced by that captured group, so the final output is the in-between string, which is what you are looking for.
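The question also asks what happens when there is no # in the name. As a rough sketch: when the pattern does not match, sub and gsub return the input unchanged rather than a special value, so you may want to check for a match yourself. The file names below are made up for illustration, and # is left unescaped here because it has no special meaning in R's default regex syntax.

```r
# Hypothetical file names, used only to illustrate the behaviour
files <- c("HelloWorld#you.txt", "archive#v1.2.tar", "nohash.txt")

# Same idea as the one-liner above: capture everything between
# the '#' and the last '.'
pattern <- ".*#(.*)\\..*"

# grepl() tells us which names actually match the pattern
matched <- grepl(pattern, files)

# Without this check, a name with no '#' would come back unchanged,
# so replace those entries with NA instead
extracted <- ifelse(matched, sub(pattern, "\\1", files), NA_character_)

print(extracted)
# [1] "you"  "v1.2" NA
```

Since each name matches the pattern at most once, sub and gsub give the same result here.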
Edit:
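The above solution captures everything up to the last . in the file name, i.e. it allows the captured part to contain dots. If you want to extract the name only up to the first . after the # you can use the regex .*\\#(\\w*)\\..* instead, provided the text between # and that . consists only of word characters (letters, digits and underscores).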
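As for why the code in the question does not work: grep returns the indices of the vector elements that match, not a character position inside the string, and length gives the number of elements in the vector, not the number of characters. If you prefer the position-based approach, regexpr returns the character position of the first match and -1 when there is none. Here is a minimal sketch along those lines; the helper name extract_after_hash is made up for this example.

```r
name <- "HelloWorld#you.txt"

grep("#", name, fixed = TRUE)   # 1: index of the matching element
length(name)                    # 1: number of elements, not characters

# Hypothetical helper: text between the first '#' and the last '.'
extract_after_hash <- function(name) {
  # Character position of the first '#', or -1 if there is none
  hash_pos <- as.integer(regexpr("#", name, fixed = TRUE))
  # Character position of the last '.', or -1 if there is none
  dot_pos <- as.integer(regexpr("\\.[^.]*$", name))
  # Only extract when both exist and the '.' comes after the '#'
  ok <- hash_pos > 0 & dot_pos > hash_pos
  ifelse(ok, substring(name, hash_pos + 1, dot_pos - 1), NA_character_)
}

extract_after_hash(name)          # "you"
extract_after_hash("nohash.txt")  # NA
```

Returning NA for names without a # is just one possible choice; you could return an empty string or the original name instead.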