I have this text:
F <- "hhhappy birthhhhhhdayyy"
and I want to remove the repeat characters, I tried this code
https://stackoverflow.com/a/11165145/10718214
and it works, but I need to remove repeat characters if it repeats more than 2, and if it repeated 2 times keep it.
so the output that I expect is
"happy birthday"
any help?
If we want to know whether a given string has repeated characters, the simplest way is to use the existing method of finding first occurrence from the end of the string, e.g. lastIndexOf in java. In Python, the equivalence would be rfind method of string type that will look for the last occurrence of the substring.
We can use string replace() function to replace a character with a new character. If we provide an empty string as the second argument, then the character will get removed from the string.
Although the String class doesn't have a remove() method, you can use variations of the replace() method and the substring() method to remove characters from strings.
Try using sub
, with the pattern (.)\\1{2,}
:
F <- ("hhhappy birthhhhhhdayyy")
gsub("(.)\\1{2,}", "\\1", F)
[1] "happy birthday"
Explanation of regex:
(.) match and capture any single character
\\1{2,} then match the same character two or more times
We replace with just the single matching character. The quantity \\1
represents the first capture group in sub
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With