I have a character vector that contains text similar to the following:
text <- c("ABc.def.xYz", "ge", "lmo.qrstu")
I would like to remove everything up to and including the last . in each element:
> "xYz" "ge" "qrstu"
However, the grep function seems to be treating . as a letter:
pattern <- "([A-Z]|[a-z])+$"
grep(pattern, text, value = TRUE)
> "ABc.def.xYz" "ge" "lmo.qrstu"
The pattern works elsewhere, such as on regexpal.
How can I get grep to behave as expected?
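grep is for finding a pattern. It returns the indices of the elements of the vector that match the pattern. If value=TRUE is specified, it returns the matching elements themselves, in full. Since every element of text ends in one or more letters, all three match and are returned whole. From the description, it seems that you want to remove a substring rather than return a subset of the initial vector.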
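To make the distinction concrete, here is a small sketch using the vector and pattern from the question. The variable names (idx, vals, hits, tails) are only illustrative. It shows that grep reports which elements contain a match, and that regmatches with regexpr is one way to pull out just the matched part instead:

```r
text <- c("ABc.def.xYz", "ge", "lmo.qrstu")
pattern <- "([A-Z]|[a-z])+$"

# grep only asks "does this element contain a match?"
idx  <- grep(pattern, text)                # indices of matching elements
vals <- grep(pattern, text, value = TRUE)  # the matching elements, unmodified
hits <- grepl(pattern, text)               # logical vector, one entry per element

# To extract the matched portion itself, pair regexpr() with regmatches()
tails <- regmatches(text, regexpr(pattern, text))

print(idx)
print(vals)
print(tails)
```

Every element ends in letters, so all three elements match, which is why grep appeared to ignore the dots.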
If you need to remove the substring, you can use sub:
sub('.*\\.', '', text)
#[1] "xYz" "ge" "qrstu"
The first argument is the pattern, '.*\\.'. It matches zero or more characters (.*) followed by a literal dot (\\.). The \\ is needed to escape the . so that it is treated as that symbol rather than as any character. Because .* is greedy, the match extends to the last . in the string. The second argument, '', is the replacement, so the matched substring is removed. Elements without a . (such as "ge") contain no match and are returned unchanged.
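As a rough illustration of why both the escape and the greediness matter, the sketch below compares three variants on the same vector. The names greedy, lazy and unescaped are just labels for this example, and the lazy variant uses perl = TRUE so that the .*? quantifier is interpreted by the PCRE engine:

```r
text <- c("ABc.def.xYz", "ge", "lmo.qrstu")

# Greedy .* with an escaped dot: strips through the LAST dot
greedy <- sub(".*\\.", "", text)

# Lazy .*? with an escaped dot: strips only through the FIRST dot
lazy <- sub(".*?\\.", "", text, perl = TRUE)

# Unescaped dot: the final . matches any character, so the whole string is consumed
unescaped <- sub(".*.", "", text)

print(greedy)     # "xYz" "ge" "qrstu"
print(lazy)       # "def.xYz" "ge" "qrstu"
print(unescaped)  # "" "" ""
```

Only the greedy, escaped version gives the output asked for in the question.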