I have a vector filled with strings of the following format: <year1><year2><id1><id2>.
The first entries of the vector look like this:
199719982001
199719982002
199719982003
199719982003
For the first entry we have: year1 = 1997, year2 = 1998, id1 = 2, id2 = 001.
I want to write a regular expression that pulls out year1, id1, and the digits of id2 that are not zero. So for the first entry the regex should output: 199721.
I have tried doing this with the stringr package, and created the following regex:
"^\\d{4}|\\d{1}(?<=\\d{3}$)"
to pull out year1 and id1. However, when using the lookbehind I get an "invalid regular expression" error. This is a bit puzzling to me. Can R not handle lookaheads and lookbehinds?
Lookbehind has the same effect as lookahead, but works backwards. It tells the regex engine to temporarily step backwards in the string to check whether the text inside the lookbehind can be matched there.
Unlike look-ahead, look-behind is used when the pattern appears before a desired match. You're “looking behind” to see if a certain string of text has the desired pattern behind it. If it does, then that string of text is a match.
Positive lookahead: the regex engine checks whether a particular element (a character, several characters, or a group) follows the item matched. If that element is present, the match is accepted; otherwise it is rejected.
Regex lookbehind is used as an assertion in Python regular expressions (re) to determine whether a pattern is present behind, i.e. to the left of, the parser's current position. Lookarounds consume no characters of their own; they only succeed or fail. Hence lookbehind and lookahead are termed zero-width assertions.
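To tie the descriptions above back to the question, here is a minimal sketch in base R. It assumes the strings always follow the 12-character layout shown in the question, and it relies on perl = TRUE, which switches base R to the PCRE engine; as far as I know, the default engine does not accept lookaround syntax. The variable names are only illustrative.

```r
# Sample data in the question's <year1><year2><id1><id2> layout
s <- c("199719982001", "199719982002", "199719982003")

# Lookahead: one digit that is followed by exactly three digits and the
# end of the string, i.e. id1.
id1 <- regmatches(s, regexpr("\\d(?=\\d{3}$)", s, perl = TRUE))

# Lookbehind: the first digit that is preceded by eight digits, again id1.
id1_behind <- regmatches(s, regexpr("(?<=\\d{8})\\d", s, perl = TRUE))

# year1 needs no lookaround: the first four digits of each string.
year1 <- regmatches(s, regexpr("^\\d{4}", s, perl = TRUE))
```

Note that the digit to capture comes before the three trailing digits, so a lookahead is the natural fit here; the lookbehind version works only because the prefix has a fixed length.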
Since this is a fixed format, why not use substr? year1 is extracted using substr(s,1,4), id1 is extracted using substr(s,9,9), and id2 as as.numeric(substr(s,10,12)). In the last case I used as.numeric to get rid of the leading zeroes.
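Put together, the substr approach might look like the sketch below. It assumes every string has exactly 12 characters (id2 in positions 10 to 12); the helper name extract_key is hypothetical, and paste0 is used here simply to glue the three pieces back into one string.

```r
# Hypothetical helper: build <year1><id1><id2 without leading zeros>
extract_key <- function(s) {
  year1 <- substr(s, 1, 4)                  # characters 1-4
  id1   <- substr(s, 9, 9)                  # character 9
  id2   <- as.numeric(substr(s, 10, 12))    # "001" becomes 1
  paste0(year1, id1, id2)
}

s <- c("199719982001", "199719982002", "199719982003")
result <- extract_key(s)
```

Because substr and paste0 are vectorised, the helper works on the whole vector at once without a loop.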
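You can use sub:
sub("^(.{4}).{4}(.)0*([0-9]+)$","\\1\\2\\3",s)
Here 0* consumes only the leading zeroes of id2. A greedy .* in that position would also swallow non-zero digits, so an id2 such as 012 would come out as 2 instead of 12.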
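As a quick check of the sub approach, the sketch below runs a capture-group pattern over a few sample strings. The pattern is a variant that skips only the leading zeros of id2 with 0*, an assumption made so that multi-digit ids keep all their significant digits; the second and third test strings are made up for illustration and do not come from the original data.

```r
# First string is from the question; the other two are invented test cases
s <- c("199719982001", "199719982012", "199719982120")

# Group 1: year1, then skip year2, group 2: id1,
# then drop leading zeros and capture the rest of id2 as group 3.
pattern <- "^(.{4}).{4}(.)0*([0-9]+)$"
result <- sub(pattern, "\\1\\2\\3", s)
```

An id2 of 000 would come out as a single 0, which matches what as.numeric gives for the same input.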