I have some text in which I want to match only certain three-digit numbers, using my departure city (NYC) in a positive lookbehind. I don't want the city or anything else in the result, only the desired three-digit number.
I can't simply use \d{3,} because the text contains other three-digit numbers, not included here, which should not be in the output.
Example string:
Depart: NYC (etd 9/30), NJ (etd 10/4)
Arrive LAX
Rate: USD500, 700P
My regular expression:
(?<=NYC)(\D|\S)*\d{3,}
Output:
(etd 9/30), NJ (etd 10/4) Arrive LAX Rate: USD500, 700
However, I want it to output 700 only.
I've also tried
(?<=NYC)(?<=(\D|\S)*)\d{3,}
but this doesn't output anything.
You can use
(?s)NYC.*?\b(\d{3,})
See the regex demo. Details:
(?s) - re.DOTALL inline modifier
NYC - the NYC string
.*? - any zero or more chars, as few as possible
\b - a word boundary
(\d{3,}) - Group 1: three or more digits.
See the Python demo:
import re
text = """Depart: NYC (etd 9/30), NJ (etd 10/4)
Arrive LAX
Rate: USD500, 700P"""
m = re.search(r'(?s)NYC.*?\b(\d{3,})', text)
if m:
    print(m.group(1))
# => 700
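To see what the \b contributes, here is a minimal sketch that runs the same pattern with and without the word boundary on the sample text from the question. The variable names are only illustrative. Without \b, the lazy .*? stops at the first run of three digits, which is the 500 inside USD500. With \b, that run is skipped because there is no word boundary between D and 5.

```python
import re

text = """Depart: NYC (etd 9/30), NJ (etd 10/4)
Arrive LAX
Rate: USD500, 700P"""

# Same pattern as above: the digits must start at a word boundary.
with_boundary = re.search(r'(?s)NYC.*?\b(\d{3,})', text)

# Same pattern with \b removed, for comparison only.
without_boundary = re.search(r'(?s)NYC.*?(\d{3,})', text)

print(with_boundary.group(1))     # => 700
print(without_boundary.group(1))  # => 500
```

If the real text has other numbers that begin at a word boundary between NYC and the one you want, this pattern will return the first of those instead, so the pattern may need an extra anchor such as the preceding label.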