How can I remove the letters between two specific patterns in R?
For instance:
a = "a#g abcdefgtdkfef_jpg>pple"
I would like to remove everything from #g through jpg>, so that I get
a1 = "apple"
I tried to find a function for this in stringr, but I couldn't.
How to remove a character or multiple characters from a string in R? You can use either the base R function gsub() or str_replace() from the stringr package to remove characters from a string.
To remove a dot followed by a number at the end of a string, use gsub(). It searches each element of the vector for that pattern at the end of the string, and the match is removed by replacing it with an empty string ("").
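As a minimal sketch of the trailing dot-and-number case, the following uses base R only. The sample vector x and the exact pattern are illustrative assumptions, not taken from the original answer.

```r
# Hypothetical sample data: two entries end in ".<digits>", one does not.
x <- c("gene1.1", "gene2.25", "sample.csv")

# "\\." is a literal dot, "[0-9]+" is one or more digits, "$" anchors at the end.
cleaned <- gsub("\\.[0-9]+$", "", x)
print(cleaned)
# [1] "gene1"      "gene2"      "sample.csv"
```

The dot has to be escaped; an unescaped . would match any character, so "gene11" would lose its last two characters as well.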
To remove the first character, use sub: match one character (. represents a single character) and replace it with ''. For the first and last characters, match the character at the start of the string ( ^. ) or the end of the string ( .
To remove non-alphanumeric characters, use str_replace_all(). The pattern [^[:alnum:]] matches any character that is not a letter or a digit, and each match is replaced with an empty string.
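The first/last-character and non-alphanumeric cases can be sketched in base R as follows. The sample string is made up, the .$ end-of-string pattern is my assumption, and gsub() stands in here for str_replace_all() so that no package is needed.

```r
# Hypothetical sample string.
x <- "[hello, world!]"

# Remove only the first character: "^." is one character at the start.
no_first <- sub("^.", "", x)

# Remove the first and last characters: "^." or ".$" (assumed end-of-string pattern).
no_ends <- gsub("^.|.$", "", x)

# Remove everything that is not a letter or a digit.
alnum_only <- gsub("[^[:alnum:]]", "", x)

print(no_first)    # [1] "hello, world!]"
print(no_ends)     # [1] "hello, world!"
print(alnum_only)  # [1] "helloworld"
```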
There's no need to load a package for this operation. You can use the base R function sub. It replaces the first match of a regular expression.
a <- "a#g abcdefgtdkfef_jpg>pple"
sub("#g.*jpg>", "", a)
# [1] "apple"
Regular expression explained:
#g matches "#g"
.* matches any character, zero or more times (as many as possible)
jpg> matches "jpg>"
So here we're removing everything starting at #g, up to and including jpg>.
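One caveat worth sketching: .* is greedy, so if the closing pattern occurs more than once, the match runs to the last jpg>. The strings below are hypothetical, and the lazy quantifier .*? with perl = TRUE is one possible way to stop at the first jpg> instead.

```r
# Hypothetical input where "jpg>" appears twice.
x <- "a#g one_jpg>pple and jpg>more"

# Greedy: removes from "#g" through the LAST "jpg>".
greedy <- sub("#g.*jpg>", "", x)

# Lazy: removes from "#g" through the FIRST "jpg>".
lazy <- sub("#g.*?jpg>", "", x, perl = TRUE)

# sub() replaces only the first match; gsub() replaces every match.
y <- "a#g x jpg>b#g y jpg>c"
first_only <- sub("#g.*?jpg>", "", y, perl = TRUE)
all_spans <- gsub("#g.*?jpg>", "", y, perl = TRUE)

print(greedy)      # [1] "amore"
print(lazy)        # [1] "apple and jpg>more"
print(first_only)  # [1] "ab#g y jpg>c"
print(all_spans)   # [1] "abc"
```

For the single-span string in the question, the greedy and lazy versions give the same result.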
Regarding your comment:
I tried to find some function in stringR but I couldn't
The package name is actually spelled stringr (it's case-sensitive). You could use str_replace:
library(stringr)
str_replace(a, "#g.*jpg>", "")
# [1] "apple"