Trying to make a regex that can handle input like either:
I have this:
^(.+)[,\\s]+(.+)\s+(\d{5})?$
It works for the #2 case, but not #1. If I change the \s+ to \s* then it works for #1 but not #2.
You can play around with it here: http://rubular.com/r/oqKBJ4r8cq
Try this:
^(.+)[,\\s]+(.+?)\s*(\d{5})?$
http://rubular.com/r/qS0e5vAQnT
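To see why the lazy quantifier makes the difference, here is a minimal Python sketch comparing the three patterns. The sample strings are hypothetical stand-ins, since the original examples are not shown; they are assumed to be a "City, ST" form with and without a trailing ZIP code. The `groups` helper is also just an illustrative name. Note that inside a character class `\\s` is a literal backslash plus the letter `s`, so `[,\\s]+` matches only the comma in these inputs, and the second group keeps its leading space.

```python
import re

# Hypothetical sample inputs; the examples from the question are not available.
NO_ZIP = "Springfield, IL"
WITH_ZIP = "Springfield, IL 62704"

ORIGINAL = re.compile(r"^(.+)[,\\s]+(.+)\s+(\d{5})?$")  # \s+ : whitespace is mandatory
RELAXED = re.compile(r"^(.+)[,\\s]+(.+)\s*(\d{5})?$")   # \s* : greedy (.+) eats the ZIP
LAZY = re.compile(r"^(.+)[,\\s]+(.+?)\s*(\d{5})?$")     # (.+?) leaves the ZIP for group 3


def groups(pattern, text):
    """Return the captured groups, or None when the pattern does not match."""
    m = pattern.match(text)
    return m.groups() if m else None


for name, pattern in [("original", ORIGINAL), ("relaxed", RELAXED), ("lazy", LAZY)]:
    print(name, groups(pattern, NO_ZIP), groups(pattern, WITH_ZIP))
# original None ('Springfield', ' IL', '62704')
# relaxed ('Springfield', ' IL', None) ('Springfield', ' IL 62704', None)
# lazy ('Springfield', ' IL', None) ('Springfield', ' IL', '62704')
```

With `\s+` the input without a ZIP fails because nothing satisfies the mandatory whitespace. With `\s*` the greedy `(.+)` swallows the ZIP, so the optional group never captures it. Making the group lazy lets it stop as early as possible and hand the digits to `(\d{5})?`.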
Try this instead:
^([^,]+),\s([A-Z]{2})(?:\s(\d{5}))?$
This expression matches both examples, captures each piece of the address in its own group, and keeps the separating whitespace out of the captures.
Here is how it breaks down:
^              # anchor to the start of the string
([^,]+)        # match everything except a comma one or more times
,              # match the comma itself
\s             # match a single whitespace character
([A-Z]{2})     # now match a two letter state code
(?:            # create a non-capture group
  \s           # match a single whitespace character
  (\d{5})      # match a 5 digit number
)?             # this whole group is optional
$              # anchor to the end of the string
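The breakdown above can be run almost verbatim using a verbose-mode regex. The following is a sketch in Python with `re.VERBOSE`; the function name `parse_address` and the sample strings are assumptions made for illustration, not part of the original answer.

```python
import re

ADDRESS_RE = re.compile(r"""
    ^               # anchor to the start of the string
    ([^,]+)         # city: one or more characters that are not a comma
    ,               # the comma itself
    \s              # a single whitespace character
    ([A-Z]{2})      # two-letter state code
    (?:             # non-capturing group
        \s          # a single whitespace character
        (\d{5})     # five-digit number (ZIP code)
    )?              # the whole group is optional
    $               # anchor to the end of the string
""", re.VERBOSE)


def parse_address(text):
    """Hypothetical helper: return (city, state, zip) or None if no match."""
    m = ADDRESS_RE.match(text)
    return m.groups() if m else None


# Hypothetical inputs; the examples from the question are not available.
print(parse_address("Springfield, IL"))        # ('Springfield', 'IL', None)
print(parse_address("Springfield, IL 62704"))  # ('Springfield', 'IL', '62704')
print(parse_address("Springfield,IL"))         # None
```

Because each separator is a single `\s`, the pattern is strict: a missing space after the comma or a double space before the ZIP makes it fail. If looser input is expected, `\s` could be changed to `\s+` or `\s*` at the cost of that strictness.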