Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using RegEx and Replace to update address fields with USPS abbreviations in MS-Access

I'm trying to write a VBA function in Access that replaces words in an address field with the standard United States Postal Abbreviations. I realize this is never going to be perfect, but I want to at least make simple abbreviations (without having to purchase address formatting software), e.g.

 input      output
 -------    -------------
 North   -> N
 Street  -> ST
 Drive   -> DR
 Lane    -> LN

I thought about using a simple table to store the string and the replacement string, and then looping through that table/recordset to perform a simple search and replace using the Replace() function, e.g. using the immediate window:

 ?Replace("123 North 3rd St", "North", "N", compare:=vbTextCompare)
 123 N 3rd St

However, this method can potentially cause errors, e.g.

 ?Replace("123 Northampton St", "North", "N", compare:=vbTextCompare)
 123 Nampton St

My original strategy was to create a replacement table with regular expression patterns and replacement strings, then loop through that table to do a more precise search and replace.

pattern                 abbrev
-------------------     ------------
{pattern for North}     N
{pattern for Street}    ST

I realized that RegEx might be overkill here, especially since I'm going to be looping through address fields over and over in a database, but couldn't think of an easier way just using the Replace() function (Update: see responses from @mwolfe02 and @Cylian, and a hybrid solution).

In the above example, I want to search for the words North and Street when they are either as a exist as word in a string (thus separated by two white spaces) or at the end of the string or beginning of a string. This covers most of the situations that warrant an abbreviation. e.g.

address                       formatted
----------------------        --------------------------
123 North 3rd St           -> 123 N 3RD ST
123 ABC Street North       -> 123 ABC ST N
North 3rd Street           -> N 3RD ST
123 North Northampton St   -> 123 N NORTHAMPTON ST

As in these examples, I want to replace all instances of the pattern in the string. I also am converting everything to upper case (I can use UCase() on the final result no problem).

Does anyone know of an existing module that does this sort of thing? Can anyone help with the pattern matching as in the above examples? For extra credit, I'm curious also about creating rule in the table to format post office boxes, e.g.

address                   formatted
----------------------    --------------------------
P.O. Box 345           -> PO BOX 345
PO Box 345             -> PO BOX 345
Post Office Box 345    -> PO BOX 345
PO. Box 345            -> PO BOX 345
P. O. Box 345          -> PO BOX 345

This stack overflow post gives the following pattern to recognize some PO boxes "^\s*P.?\s?O.?\sB[Oo][Xx]." (admittedly not the third example above). Again, I'm not as comfortable with matching and replacement sets to figure out how to write this more precise replace function. Is there a RegEx/Access expert who can help?

like image 664
regulus Avatar asked May 29 '12 09:05

regulus


2 Answers

Try this function

Public Function FormatPO(inputString$)
'This example uses **Microsoft VBScript Regular Expressions 5.5**
Dim re As New RegExp, result$
With re
    .Pattern = "\bP(?:[. ]+|ost +)?O(?:ff\.?(?:ice))?[. ]+B(?:ox|\.) +(\d+)\b"
    .Global = True
    .IgnoreCase = True
    If .test(inputString) Then
        FormatPO = .Replace(inputString, "PO BOX $1")
    Else
        MsgBox "Data doesn't matched!"
    End If
End With

and could be called as (from immediate window)

?FormatPO("P.O. Box 563")

gives result

PO BOX 563

Matching pattern for Street names with addresses need more time to built. But you could visit here and build your RegEx online.

Hope this helps.

like image 95
Cylian Avatar answered Sep 28 '22 05:09

Cylian


@Cylian has a good answer for the second part of your question. I'll try to address the first. If your only concern is that you replace whole words in the address then the following function will do what you need:

Function AddressReplace(AddressLine As String, _
                        FullName As String, _
                        Abbrev As String)
    AddressReplace = Trim(Replace(" " & AddressLine & " ", _
                                  " " & FullName & " ", _
                                  " " & Abbrev & " "))
End Function

It encloses the address line in an opening and closing space, so that you can require an opening and closing space on each word you are trying to replace. It finishes up with a trim to get rid of those temporary spaces.

The following procedure tests the code and produces the output you are looking for:

Sub TestAddressReplace()
    Debug.Print AddressReplace("123 North 3rd St", "North", "N")
    Debug.Print AddressReplace("123 Northampton St", "North", "N")
End Sub
like image 40
mwolfe02 Avatar answered Sep 28 '22 04:09

mwolfe02