Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to store distances between places?

Tags:

php

mysql

storage

I have a database containing places, and I need to show distances from any place to other ones in my webpage. Having the distances stored somewhere will save lots of work (loading them should be easier than computing them anew). But how to save the square matrix of the distances? Creating a new column every time I insert a new row doesn't seem to be a good solution, but I didn't find any better solution (though I can think of workarounds such as computing some 10 or 20 nearest distances and assuming I will rarely need more).

What is optimal way to save square tables of variable (and growing) size in PHP/MySQL? Or is there no good solution and my (or some other) workaround is better?

like image 506
Pavel V. Avatar asked Sep 13 '13 12:09

Pavel V.


People also ask

How do they determine distance between cities?

Similarly, for the mileage log on the official state highway map, DOT uses GIS software to determine the center point. Then they pick the nearest major intersection, and use that point for computing distances between cities.

How far apart two points or places are?

The distance between any two points is the length of the line segment joining the points. There is only one line passing through two points. So, the distance between two points can be calculated by finding the length of this line segment connecting the two points.

How do you find the distance between two addresses in Python?

Using geopy. distance. distance((lat_1, lon_1), (lat_2, lon_2)) returns the distance on the surface of a space object like Earth. You can choose whether you want the distance in kilometers , miles , nautical miles or feet .


2 Answers

Edit Note: As was mentioned in a comment, once you get enough places it may make more sense to just store long/lat values and calculate the distance on the fly based on those instead. However the solution explained here may still be relevant for other applications.


The best way to handle this would be with a pivot table, where each row has two place id's, and a distance value.

Now, since Distance A-B is the same as B-A, we only need to store each pairing once. We can do this by only ever storing a distance if the ID of A is less than B.


SETUP

First a places table to store your places

id | name
---+---------
 1 | Place_A
 2 | Place_B
 3 | Place_C
 4 | Place_D

Then a places_distances Pivot table:

place_id_1 | place_id_2 | distance
-----------+------------+----------
         1 |          2 | 10.0
         1 |          3 | 20.0
         1 |          4 | 15.0
         2 |          3 | 12.0
         2 |          4 |  8.0
         3 |          4 | 14.0

Note that pivot tables do not need their own ID field (though some may argue it's still good to have one sometimes). You will set up a unique key as follows (you'll want to look into the documentation for correct usage):

UNIQUE KEY `UNIQUE_placesDistances_primary`(`place_id_1`,`place_id_2`)

This ensures that you cannot have the same place/place paring in the table twice.

You will also want to make sure to set up foreign keys:

CONSTRAINT FOREIGN KEY `FK_placesDistances_place1` (`place_id_1`) 
    REFERENCES `places`(`id`),
CONSTRAINT FOREIGN KEY `FK_placesDistances_place2` (`place_id_2`)
    REFERENCES `places`(`id`)

Which will ensure that you can only add entries for place you actually have defined in places. it also means (if you use the default foreign key behavior) that you can't delete a place if you have distance row referencing that place.


Usage Examples

Looking up the distance between two places

(Given two variables @id_1 as the id of the first place and @id_2 as the id of the second place)

SELECT `distance`
FROM `places_distances`
WHERE (`place_id_1` = @id_1 AND `place_id_2` = @id_2)
    OR (`place_id_2` = @id_1 AND `place_id_11` = @id_2)
LIMIT 1;

We use the OR to account for the case where we try to look up distance 2 to 1 and not 1 to 2 - remember, we only store values where the first place's id is less than the second to avoid storing duplicates.


Inserting a new distance

(Given three variables @id_1 as the id of the first place and @id_2 as the id of the second place, and @distance being the distance)

INSERT `places_distances`(`place_id_1`,`place_id_2`,`distance`)
    VALUES(LEAST(@id_1, @id_2),GREATEST(@id_1, @id_2), @distance)

We're using the built in comparison functions LEAST and GREATEST to help maintain our rule that we only store places where the first ID is less than the second, to avoid duplicates.


Showing a list of place names, sorted by their distance from furthest to closest

To get the original names from the places table to show up in our places_distances query we have to join them together. In this case LEFT JOIN is the best choice since we only care about what is in the places_distances table. For more info on MySQL joins check here.

SELECT 
    `p_1`.`name` AS `place_1`,
    `p_2`.`name` AS `place_2`,
    `distance`
FROM `places_distances`
LEFT JOIN `places` AS `p_1`
    ON `distances`.`place_id_1` = `p_1`.`id`
LEFT JOIN `places` AS `p_2`
    ON `distances`.`place_id_2` = `p_2`.`id`
ORDER BY `distance` DESC

Which should return a table like this:

place_id_1 | place_id_2 | distance
-----------+------------+----------
   Place_A |    Place_C | 20.0
   Place_A |    Place_D | 15.0
   Place_C |    Place_D | 14.0
   Place_B |    Place_C | 12.0
   Place_A |    Place_B | 10.0
   Place_B |    Place_D |  8.0

showing a table of places and their distances to a specific given place

This is a bit more tricky since we need to show the name in a row that is not our input place, but we can use another useful function IF(CONDITION,'TRUE_OUTPUT','FALSE_OUTPUT') to do that.

(@place_name being the variable containing the place name, in this case 'Place_B')

SELECT 
    IF(`p_1`.`name`=@place_name, `p_2`.`name`, `p_1`.`name`) AS `name`,
    `distance`
FROM `places_distances`
LEFT JOIN `places` AS `p_1`
    ON `distances`.`place_id_1` = `p_1`.`id`
LEFT JOIN `places` AS `p_2`
    ON `distances`.`place_id_2` = `p_2`.`id`
WHERE `p_1`.`name` = @place_name OR `p_2`.`name` = @place_name
ORDER BY `distance` DESC

Which should return a table like this:

   name | distance
--------+-----------
Place_C | 12.0
Place_A | 10.0
Place_D |  8.0
like image 113
Johannes Avatar answered Sep 29 '22 12:09

Johannes


I would store the lat/long for all places and write a function to calculate distance between them with the lat/long information.

In this way, no need to calculate the distances for new places you want to add in you DB.

Moreover if you have lots of places, using a pivot table to store only the distances, you have to be aware that this table can grow very fast. As you need to cover all combinaisons of places.

For instance: for 1000 places you will have 1000 * 1000 - 1000 = 999000 rows in your table. Do the math for larger number but this table might contain a lot a rows depends on how many places you've got.

like image 24
TheEwook Avatar answered Sep 29 '22 12:09

TheEwook