Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compare two addresses which are not in standard format

I have to compare addresses from two tables and get the Id if the address matches. Each table has three columns Houseno, street, state The address are not in standard format in either of the tables. There are approx. 50,000 rows, I need to scan through

At some places its Ave. Avenue Ave . Str Street, ST. Lane Ln. Place PL Cir CIRCLE. Any combination with a dot or comma or spaces ,hypen. I was thinking of combining all three What can be best way to do it in SQL or PLSQL for example

table1

     HNO         STR          State
     -----       -----         ----- 
      12        6th Ave         NY
      10        3rd Aven        SD
      12-11     Fouth St        NJ                         
      11        sixth Lane      NY
      A23       Main Parkway    NY
      A-21      124 th Str.     VA

table2

      id   HNO         STR          state
     --    -----       -----         ----- 
     1      12        6 Ave.         NY
     13     10        3 Avenue       SD
     15     1121      Fouth Street   NJ                         
     33     23        9th Lane       NY
     24     X23       Main Cir.      NY
     34     A1       124th Street    VA
like image 556
Rajiv A Avatar asked Oct 22 '22 15:10

Rajiv A


1 Answers

There is no simple way to achieve what you want. There is a expensive software (google for "address standardization software") that can do this but rarely 100% automatic.

What this type of software does is to take the data, use complex heuristics to try to figure out the "official" address and then return that (sometimes with the confidence that the result is correct, sometimes a list of results sorted by confidence).

For a small percentage of the data, the software will simply not work and you'll have to fix that yourself.

like image 124
Aaron Digulla Avatar answered Oct 24 '22 12:10

Aaron Digulla