I have this address:
Grimshaw Lane, Bollington, Macclesfield SK10 5JB
When I look this address up through an API, I get this:
Bollington Wharf, Grimshaw Lane, Bollington, United Kingdom
I know how preg_match works, but I believe there must be a way to compare two similar (not identical) texts and decide whether they refer to the same address, even if they differ slightly.
There's obviously no solution that will give you 100% reliable results, but why not try this: send both strings to Google Maps via wget and compare the results. Google has invested, at the very least, tens of thousands of man-hours into solving the problem you're looking at. Why not just let them deal with it?
I'm not sure if this helps, but I would consider combining explode(), to split each address into an array of strings, with levenshtein(), to compare the individual elements of those arrays.
It depends on how many addresses you have to compare, but if you just have a few (NOT thousands), this approach should be workable.
A rough sketch would be something like this:
$search_address = "Bollington Wharf, Grimshaw Lane, Bollington, United Kingdom";
$my_addresses = array(
    "Grimshaw Lane, Bollington, Macclesfield SK10 5JB",
    "Different Lane, YabbaDabbaDoo, Otherfield SK12 6BJ",
    // ...
);
// Split the search address into trimmed, comma-separated parts.
$search_array = array_map('trim', explode(',', $search_address));
$lowest_compare_value = PHP_INT_MAX;
$lowest_compare_address = null;
foreach ($my_addresses as $my_address) {
    $current_address_array = array_map('trim', explode(',', $my_address));
    $compare_value = 0;
    foreach ($current_address_array as $my_element) {
        // Find the search part that is closest to this part of the stored address.
        $lowest_value = PHP_INT_MAX;
        foreach ($search_array as $search_element) {
            $new_value = levenshtein($search_element, $my_element);
            if ($new_value < $lowest_value) { $lowest_value = $new_value; }
        }
        $compare_value += $lowest_value;
    }
    // Keep the stored address with the smallest total distance so far.
    if ($compare_value < $lowest_compare_value) {
        $lowest_compare_value = $compare_value;
        $lowest_compare_address = $my_address;
    }
}
You should also decide on a maximum plausible levenshtein value, so you can reject a best match that is still too far off.
As mentioned, this method takes time and should NOT be used in an application that needs a lot of speed, or if you have many local addresses.
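One possible way to apply such a cutoff is sketched below. The function names address_distance() and is_plausible_match(), and the default ratio of 0.5, are assumptions made for illustration and are not part of the answer above. The idea is to divide the summed distance by the length of the stored address, so that long and short addresses are judged on the same scale.

```php
<?php
// Hypothetical helper: for each comma-separated part of $candidate, take the
// smallest levenshtein() distance to any part of $search, and sum the results.
function address_distance(string $candidate, string $search): int
{
    $candidate_parts = array_map('trim', explode(',', strtolower($candidate)));
    $search_parts    = array_map('trim', explode(',', strtolower($search)));
    $total = 0;
    foreach ($candidate_parts as $candidate_part) {
        $best = PHP_INT_MAX;
        foreach ($search_parts as $search_part) {
            $best = min($best, levenshtein($candidate_part, $search_part));
        }
        $total += $best;
    }
    return $total;
}

// Hypothetical cutoff: accept the match only if the summed distance is at most
// $max_ratio times the number of non-separator characters in $candidate.
// The 0.5 default is an arbitrary starting point, not a recommended value.
function is_plausible_match(string $candidate, string $search, float $max_ratio = 0.5): bool
{
    $length = strlen(str_replace([',', ' '], '', $candidate));
    if ($length === 0) {
        return false;
    }
    return address_distance($candidate, $search) / $length <= $max_ratio;
}
```

The ratio would need tuning against real data: a postcode that is missing from the API result, as in the example above, adds a large distance by itself, so a cutoff that is too strict will reject addresses that are in fact the same.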