I have the following code:
input <- "1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-I1-1-I2-1-TR-1-I1-1-I2-1-FA-1-I3-1-I1-1-FA-1-FA-1-NR-1-I3-1-I2-1-TR-1-I1-1-I2-1-I1-1-I2-1-FA-1-I2-1-I1-1-I3-1-FA-1-QU-1-I1-1-I2-1-I2-1-I2-1-NR-1-I2-1-I2-1-NR-1-I1-1-I2-1-I1-1-NR-1-I3-1-QU-1-I2-1-I3-1-QU-1-NR-1-I2-1-I1-1-NR-1-QU-1-QU-1-I2-1-I1-1-EX"
innovation_patterns <- gsub(input, pattern = "-1-", replacement = "-")
innovation_patterns <- lapply(innovation_patterns, str_extract_all, '(?:I\\d-?)*I3(?:-?I\\d)*')
This outputs:
"I2-I3" "I3-I1" "I3-I2" "I2-I1-I3" "I3" "I2-I3"
However, I only want to extract matches to the regex that are following immediately to a specific string, e.g.:
only match the regex when it's preceded by the literal string FA-I2-I2-I2-EX
.
This, for example, would be the first match of the regex, while the second match is preceded by FA-I1-I2-TR-I1-I2-FA
.
The expected output is roughly the same as in the regex above, but only selecting one of the 5 matches, because it needs to be preceded by a specific literal string.
How can I modify this regex to achieve this purpose? I assume it needs to use a positive lookbehind to first identify the literal string, then execute the regex.
I don't know if I'm fully understanding what you mean, but it seems you could use positive lookbehind.
For instance:
(?<=a)b
(positive lookbehind) matches theb
(and only theb
) incab
, but does not matchbed
ordebt
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With