Basically, I want to write a regex to match flight numbers in the format AA123 or AA1234.
\b[A-Z]{2}\d{3,4}\b
That is, two letters plus 3 or 4 digits. My solution and results are shown in the pictures. I cannot understand why it fails when the spaces between words are omitted.
[Image: results with spaces]
[Image: results without spaces (Debuggex)]

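As Lucas mentions in the comments, the word boundaries \b are why your regex fails when there are no spaces around the flight codes: \b matches only between a word character (letter, digit, or underscore) and a non-word character (or the start/end of the string), so a code glued to adjacent letters or digits has no boundary next to it.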
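To see the word-boundary behaviour concretely, here is a minimal Python sketch. The two sample strings are made up for illustration; they are not the inputs from the question's screenshots.

```python
import re

# Original pattern from the question, with \b on both sides.
pattern = re.compile(r"\b[A-Z]{2}\d{3,4}\b")

# Hypothetical sample inputs (assumptions, not taken from the question).
spaced = "from AA123 to BA4567"
glued = "fromAA123toBA4567"

print(pattern.findall(spaced))  # ['AA123', 'BA4567']
print(pattern.findall(glued))   # []
```

In the second string every flight code touches a letter on at least one side, so \b never matches there.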
Since you are using the pattern in Python, you can use lookarounds to restrict the enclosing context for the pattern. For example, the pattern should match only if it is not preceded by an uppercase letter (since the code itself starts with uppercase letters) and is not followed by a digit (since it ends with digits).
In your case, use
(?<![A-Z])[A-Z]{2}\d{3,4}(?!\d)
See the regex demo
The (?<![A-Z]) negative lookbehind fails the match if there is an uppercase letter right before the two uppercase letters of the flight number, and the (?!\d) negative lookahead fails the match if the 3 or 4 digits after the two uppercase letters are followed by another digit.
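A short sketch of how the lookaround pattern could be used from Python. The helper name find_flights and the sample strings are hypothetical and only serve the illustration.

```python
import re

# Lookaround version from the answer: no uppercase letter directly before,
# no digit directly after.
FLIGHT_RE = re.compile(r"(?<![A-Z])[A-Z]{2}\d{3,4}(?!\d)")

def find_flights(text):
    """Return all flight codes found in text (hypothetical helper)."""
    return FLIGHT_RE.findall(text)

print(find_flights("fromAA123toBA4567"))  # ['AA123', 'BA4567']
print(find_flights("XAA123"))             # [] (an uppercase letter precedes AA)
print(find_flights("AA12345"))            # [] (a fifth digit follows)
```

Note that the lookbehind only rejects uppercase letters, so lowercase text directly before a code is still accepted.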
Other airline code regex considerations
Airline codes may be more complex than this: the first 2 characters may include digits as well as letters (but cannot both be digits), an optional whitespace may follow them, and the final number may have from 2 to 4 digits. To cover these cases, consider using
(?<![\dA-Z])(?!\d{2})([A-Z\d]{2})\s?(\d{2,4})(?!\d)
See another regex demo.
Details
(?<![\dA-Z]) - no letter or digit right before the current location
(?!\d{2}) - no 2 digits allowed immediately to the right of the current location
[A-Z\d]{2} - 2 digits or letters
\s? - an optional whitespace
\d{2,4} - two, three or four digits
(?!\d) - no digit immediately to the right of the current location is allowed.
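The extended pattern has two capturing groups, so in Python re.findall returns (airline code, flight number) tuples. A minimal sketch follows; the helper name parse_flights and the sample inputs are assumptions made for illustration.

```python
import re

# Extended pattern from the answer; group 1 is the 2-character airline code,
# group 2 is the 2- to 4-digit flight number.
AIRLINE_RE = re.compile(r"(?<![\dA-Z])(?!\d{2})([A-Z\d]{2})\s?(\d{2,4})(?!\d)")

def parse_flights(text):
    """Return (airline_code, flight_number) tuples (hypothetical helper)."""
    return AIRLINE_RE.findall(text)

print(parse_flights("U2 1234"))  # [('U2', '1234')]
print(parse_flights("9W 25"))    # [('9W', '25')]
print(parse_flights("AA123"))    # [('AA', '123')]
print(parse_flights("12 345"))   # [] (the code may not be two digits)
```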