I have to compare addresses from two tables and get the Id if the address matches. Each table has three columns Houseno, street, state The address are not in standard format in either of the tables. There are approx. 50,000 rows, I need to scan through
At some places its Ave. Avenue Ave . Str Street, ST. Lane Ln. Place PL Cir CIRCLE. Any combination with a dot or comma or spaces ,hypen. I was thinking of combining all three What can be best way to do it in SQL or PLSQL for example
table1
HNO STR State
----- ----- -----
12 6th Ave NY
10 3rd Aven SD
12-11 Fouth St NJ
11 sixth Lane NY
A23 Main Parkway NY
A-21 124 th Str. VA
table2
id HNO STR state
-- ----- ----- -----
1 12 6 Ave. NY
13 10 3 Avenue SD
15 1121 Fouth Street NJ
33 23 9th Lane NY
24 X23 Main Cir. NY
34 A1 124th Street VA
There is no simple way to achieve what you want. There is a expensive software (google for "address standardization software") that can do this but rarely 100% automatic.
What this type of software does is to take the data, use complex heuristics to try to figure out the "official" address and then return that (sometimes with the confidence that the result is correct, sometimes a list of results sorted by confidence).
For a small percentage of the data, the software will simply not work and you'll have to fix that yourself.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With