I would like to capitalize everything in a character vector that comes after the first _. For example, the following vector:
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f")
Should come out like this:
"NYC_23DF" "BOS_3_RB" "mgh_3_3_F"
I have been trying to play with regular expressions, but am not able to do this. Any suggestions would be appreciated.
You were very close:
gsub("(_.*)","\\U\\1",x,perl=TRUE)
seems to work. You just needed to use _.* (underscore followed by zero or more other characters) rather than _* (zero or more underscores) ...
To take this apart a bit more:
_.* gives a regular expression pattern that matches an underscore _ followed by any number (including 0) of additional characters; . denotes "any character" and * denotes "zero or more repeats of the previous element"
() denotes that it is a pattern we want to store
\\1 in the replacement string says "insert the contents of the first matched pattern", i.e. whatever matched _.*
\\U, in conjunction with perl=TRUE, says "put what follows in upper case" (uppercasing _ has no effect; if we wanted to capitalize everything after (for example) a lower-case g, we would need to exclude the g from the stored pattern and include it in the replacement pattern: gsub("g(.*)","g\\U\\1",x,perl=TRUE))
For more details, search for "replacement" and "capitalizing" in ?gsub (and ?regexp for general information about regular expressions).
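To make the pieces above concrete, here is a small sketch that runs the two patterns side by side on the vector from the question. The variable names (wrong, right, after_g) are just for illustration and are not part of the original answer.

```r
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f")

# "_*" only ever matches runs of underscores (possibly empty ones),
# so the letters after the underscore are never part of the match.
wrong <- gsub("(_*)", "\\U\\1", x, perl = TRUE)

# "_.*" matches the first underscore and everything after it;
# \\U then upper-cases the captured group when it is re-inserted.
right <- gsub("(_.*)", "\\U\\1", x, perl = TRUE)

# Delimiter kept outside the group: upper-case everything after the
# first lower-case "g", re-inserting the "g" itself unchanged.
after_g <- gsub("g(.*)", "g\\U\\1", x, perl = TRUE)

print(wrong)
print(right)
print(after_g)
```

Only the third element contains a lower-case g, so after_g should differ from x in that element alone.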
gsubfn in the gsubfn package is like gsub except the replacement string can be a function. Here we match _ and everything afterwards, feeding the match through toupper:
> library(gsubfn)
>
> gsubfn("_.*", toupper, x)
[1] "NYC_23DF" "BOS_3_RB" "mgh_3_3_F"
Note that this approach involves a particularly simple regular expression.
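If installing gsubfn is not an option, the same "match, then pass the match through toupper" idea can probably be sketched in base R with regexpr and regmatches<-. This is not from the answers above, just one possible equivalent; the names m and out are made up for the example.

```r
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f")

# regexpr() records where the first match of "_.*" starts in each
# element and how long it is.
m <- regexpr("_.*", x)

# Work on a copy, then overwrite each matched substring with its
# upper-cased version. Elements without a match are left untouched.
out <- x
regmatches(out, m) <- toupper(regmatches(out, m))

print(out)
```

Like the gsubfn version, this keeps the regular expression simple and leaves the case conversion to toupper.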