I'm using Ruby's scan() method to find text in a particular format. I then output it into a string separated by commas. The text I'm trying to find would look like this:
AB_ABCD_123456
Here's what I've come up with so far to find the above. It works fine:
matches = text.scan(/.._...._[0-9][0-9][0-9][0-9][0-9][0-9]/)
puts matches.uniq.sort.join(', ')
Now I need a regex that will find the above with or without a two-letter country designation at the end. For example, I would like to be able to find all three of the below:
AB_ABCD_123456
AB_ABCD_123456UK
AB_ABCD_123456DE
I know I could use two or three different scans to achieve my result, but I'm wondering if there's a way to get all three with one regex.
/.._...._\d{6}(?:[A-Z]{2})?/
Why not just use split?
"AB_ABCD_123456".split(/_/).join(',')
Handles the cases you listed without modification.
/.._...._[0-9][0-9][0-9][0-9][0-9][0-9](?:[A-Z][A-Z])?/
You can also use {} to make the regex shorter:
/.{2}_.{4}_[0-9]{6}(?:[A-Z]{2})?/
Explanation: ? makes the preceding pattern optional. () groups expressions together (so Ruby knows the ? applies to the two letters). The ?: after the opening ( makes the group non-capturing (capturing groups would change the values yielded by scan).
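To make the difference between the two kinds of group concrete, here is a minimal sketch. The sample text is made up for illustration; only the ID format comes from the question. With a non-capturing group, scan yields the full matched strings; with a capturing group, it yields only what the group captured.

```ruby
# Hypothetical sample input; the IDs follow the format from the question.
text = "AB_ABCD_123456 foo AB_ABCD_123456UK bar AB_ABCD_123456DE AB_ABCD_123456"

# Non-capturing group: scan returns each full match as a string.
matches = text.scan(/.{2}_.{4}_[0-9]{6}(?:[A-Z]{2})?/)
result = matches.uniq.sort.join(', ')
puts result
# prints: AB_ABCD_123456, AB_ABCD_123456DE, AB_ABCD_123456UK

# Capturing group: scan returns one array per match holding only the
# group's contents, with nil where the optional group did not match.
captured = text.scan(/.{2}_.{4}_[0-9]{6}([A-Z]{2})?/)
p captured
# prints: [[nil], ["UK"], ["DE"], [nil]]
```

Note that . matches any character, so if the IDs always consist of uppercase letters, replacing .{2} and .{4} with [A-Z]{2} and [A-Z]{4} would likely make the pattern stricter; that is an assumption about the data, not something the question states.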