How to fill gap between two characters with regex

Question

I have a data set like the one below. I would like to replace all dots between two 1's with 1's, as shown in desired.result. Can I do this with regex in base R?

I tried:

regexpr("^1\\.1$", my.data$my.string, perl = TRUE)

Here is a solution in C#: "Characters between two exact characters"

Thank you for any suggestions.

my.data <- read.table(text='
     my.string                           state
     ................1...............1.    A
     ......1..........................1    A
     .............1.....2..............    B
     ......1.................1...2.....    B
     ....1....2........................    B
     1...2.............................    C
     ..........1....................1..    C
     .1............................1...    C
     .................1...........1....    C
     ........1....2....................    C
     ......1........................1..    C
     ....1....1...2....................    D
     ......1....................1......    D
     .................1...2............    D
', header = TRUE, na.strings = 'NA', stringsAsFactors = FALSE)

desired.result <- read.table(text='
     my.string                           state
     ................11111111111111111.    A
     ......1111111111111111111111111111    A
     .............1.....2..............    B
     ......1111111111111111111...2.....    B
     ....1....2........................    B
     1...2.............................    C
     ..........1111111111111111111111..    C
     .111111111111111111111111111111...    C
     .................1111111111111....    C
     ........1....2....................    C
     ......11111111111111111111111111..    C
     ....111111...2....................    D
     ......1111111111111111111111......    D
     .................1...2............    D
', header = TRUE, na.strings = 'NA', stringsAsFactors = FALSE)

hwnd · Accepted Answer

Below is an option using gsub with the \G feature and lookaround assertions.

gsub('(?:1|\\G(?<!^))\\K\\.(?=\\.*1)', '1', my.data$my.string, perl = TRUE)
# [1] "................11111111111111111." "......1111111111111111111111111111"
# [3] ".............1.....2.............." "......1111111111111111111...2....."
# [5] "....1....2........................" "1...2............................."
# [7] "..........1111111111111111111111.." ".111111111111111111111111111111..."
# [9] ".................1111111111111...." "........1....2...................."
# [11] "......11111111111111111111111111.." "....111111...2...................."
# [13] "......1111111111111111111111......" ".................1...2............"

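The \G feature is an anchor that can match at one of two positions: the start of the string or the position where the previous match ended. Since you want to leave the dots at the start of the string untouched, the negative lookbehind in \G(?<!^) keeps \G from matching at the start of the string. The first dot of each gap is reached through the 1 alternative, every following dot through \G, and the lookahead (?=\.*1) requires that only dots lie between the current dot and the next 1.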
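To see what the (?<!^) lookbehind contributes, a minimal sketch can run the pattern with and without it on a short string. The input string and the variable names below are made up for illustration and are not part of the question's data.

```r
# Illustrative input: two leading dots, a gap between two 1's, two trailing dots.
x <- "..1..1.."

# Pattern from the answer: \G is not allowed to match at the start of the string.
with_guard <- gsub("(?:1|\\G(?<!^))\\K\\.(?=\\.*1)", "1", x, perl = TRUE)

# Same pattern without the lookbehind: \G also matches at position 0,
# so the leading dots that precede a 1 are filled as well.
without_guard <- gsub("(?:1|\\G)\\K\\.(?=\\.*1)", "1", x, perl = TRUE)

print(with_guard)     # "..1111.."
print(without_guard)  # "111111.."
```

In both cases the trailing dots stay untouched, because the lookahead finds no 1 after them.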

The \K escape sequence resets the starting point of the reported match, so any previously consumed characters are no longer included in it. Here that keeps the 1 that opens a gap out of the text being replaced.
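The effect of \K is easiest to see in isolation. The following sketch uses a toy string and toy variable names (assumptions for illustration only) and compares a substitution with and without \K.

```r
# Without \K the whole match "1." is replaced, so the 1 is lost.
no_reset <- sub("1\\.", "X", "a1.b", perl = TRUE)

# With \K only the part matched after \K (the dot) is reported and replaced.
with_reset <- sub("1\\K\\.", "X", "a1.b", perl = TRUE)

print(no_reset)    # "aXb"
print(with_reset)  # "a1Xb"
```

Note that \K is a PCRE feature, so it needs perl = TRUE.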


Tags: regex, r

Mark Miller
