
gsub - reduce all repeating characters to one instance

Tags: regex, r, gsub

A slightly odd question for you all - I have solved my problem of reducing every run of repeated characters in a string to a single instance, but I don't really understand my solution. For example:

txt <- "haarbbbbbbijjjjjan"
gsub("([a-z])\\1+", "\\1", txt)
[1] "harbijan"

Is this just matching each run of a repeated letter (the search term plus repeats of the search term) and replacing it with the searched-for letter? Or is it doing something unintended that I don't fully grasp?

thelatemail asked Sep 16 '25 22:09


1 Answer

You've declared one capture group: any character between a and z. \\1 is a backreference to whatever that group matched, so ([a-z])\\1+ matches a letter followed by one or more repetitions of that same letter. The whole run is then replaced with the group's value. For example, if the group matched a, then any run of two or more a's is replaced with a single a.
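To see this in action, here is a small sketch that first lists the runs the pattern matches and then collapses them. It uses the string from the question; the variable names pattern, runs and collapsed are only illustrative, and the last call is an extra check I added to show that non-adjacent duplicates are left alone.

```r
txt <- "haarbbbbbbijjjjjan"

# ([a-z]) captures one lowercase letter as group 1;
# \\1+ then requires one or more further copies of that same letter.
pattern <- "([a-z])\\1+"

# List the runs the pattern matches, before replacing anything.
runs <- regmatches(txt, gregexpr(pattern, txt))[[1]]
print(runs)        # "aa" "bbbbbb" "jjjjj"

# Replace each matched run with the single captured letter.
collapsed <- gsub(pattern, "\\1", txt)
print(collapsed)   # "harbijan"

# Only adjacent repeats are collapsed; "abab" has none, so it is unchanged.
unchanged <- gsub(pattern, "\\1", "abab")
print(unchanged)   # "abab"
```

Single letters such as the h and r never match, because \\1+ needs at least one repeat, so they pass through untouched.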

Hope I made myself clear =)

Andrew Logvinov answered Sep 18 '25 13:09