How to determine the center zip code for each state? [closed]

Tags: php, mysql

I have a Zip Code MySQL database that has the lat/lng of each zip code. How do I go about finding the center zip code for each state? I would be doing this in PHP.

Edward asked Oct 23 '12 16:10


1 Answer

Assuming that you wish to treat the coordinates of each zip code as equally weighted points within each state, and find the ZIP code closest to the effective "centre of mass", you can do it directly in MySQL by combining:

  • How to calculate geographical centre;

  • the Finding Locations with MySQL section of Google's article on Creating a Store Locator with PHP, MySQL & Google Maps; and

  • the general technique for obtaining a groupwise minimum/maximum.

The complete query is:

-- the ZIP code closest to each state's "centre of mass"
SELECT state, z.zip_code
FROM   zip_codes z JOIN (

  -- the smallest angular distance (in radians) between the "centre of mass"
  -- and any ZIP code in the same state
  SELECT   state, c.lat, c.lon,
           MIN(ACOS(
             COS(c.lat) * COS(RADIANS(z.lat)) * COS(RADIANS(z.lon) - c.lon)
           + SIN(c.lat) * SIN(RADIANS(z.lat))
           )) AS min_dist
  FROM     zip_codes z JOIN (

    -- the "centre of mass" of each state (lat and lon are in radians)
    SELECT   state, 
             ATAN2(
               SUM(SIN(RADIANS(lat))) / COUNT(*),
               SQRT(
                 POW(SUM(COS(RADIANS(lat)) * SIN(RADIANS(lon))) / COUNT(*), 2)
               + POW(SUM(COS(RADIANS(lat)) * COS(RADIANS(lon))) / COUNT(*), 2)
               )
             ) AS lat,
             ATAN2(
               SUM(COS(RADIANS(lat)) * SIN(RADIANS(lon))) / COUNT(*),
               SUM(COS(RADIANS(lat)) * COS(RADIANS(lon))) / COUNT(*)
             ) AS lon
    FROM     zip_codes
    GROUP BY state

  ) c USING (state)
  GROUP BY state, c.lat, c.lon

) d USING (state)
-- keep only the ZIP code(s) whose distance equals the state's minimum
WHERE  ACOS(
         COS(d.lat) * COS(RADIANS(z.lat)) * COS(RADIANS(z.lon) - d.lon)
       + SIN(d.lat) * SIN(RADIANS(z.lat))
       ) = d.min_dist
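
If it helps to sanity-check the query, the same steps can be sketched outside the database. The Python below is only an illustration of the approach (average the unit vectors per state, then pick the ZIP code with the smallest central angle to that average); the function names and the sample rows are made up for the example and are not part of any schema or library.

```python
import math

def centre_of_mass(points):
    """Average the unit vectors of (lat, lon) points given in degrees.

    Returns the resulting (lat, lon) in radians, like the innermost subquery.
    """
    n = len(points)
    x = sum(math.cos(math.radians(lat)) * math.cos(math.radians(lon)) for lat, lon in points) / n
    y = sum(math.cos(math.radians(lat)) * math.sin(math.radians(lon)) for lat, lon in points) / n
    z = sum(math.sin(math.radians(lat)) for lat, lon in points) / n
    return math.atan2(z, math.hypot(x, y)), math.atan2(y, x)

def angular_distance(c_lat, c_lon, lat, lon):
    """Central angle in radians between a centre (radians) and a point (degrees)."""
    lat, lon = math.radians(lat), math.radians(lon)
    cos_angle = (math.cos(c_lat) * math.cos(lat) * math.cos(lon - c_lon)
                 + math.sin(c_lat) * math.sin(lat))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def central_zip_per_state(rows):
    """rows: iterable of (state, zip_code, lat, lon). Returns {state: zip_code}."""
    by_state = {}
    for state, zip_code, lat, lon in rows:
        by_state.setdefault(state, []).append((zip_code, lat, lon))
    result = {}
    for state, zips in by_state.items():
        c_lat, c_lon = centre_of_mass([(lat, lon) for _, lat, lon in zips])
        closest = min(zips, key=lambda r: angular_distance(c_lat, c_lon, r[1], r[2]))
        result[state] = closest[0]
    return result

# Made-up sample data, purely for illustration.
rows = [
    ("AA", "00001", 40.0, -100.0),
    ("AA", "00002", 41.0, -100.0),
    ("AA", "00003", 42.0, -100.0),
    ("BB", "00010", 30.0, -90.0),
]
print(central_zip_per_state(rows))
```

Unlike the SQL, this sketch returns a single ZIP code per state even when two are exactly tied for closest.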

Notes

  1. This could prove fairly slow, as indexing (other than on the state column) will not be of much use; but since it is a one-off operation, caching the result shouldn't be much of a problem.

  2. There are many ZIP codes in densely populated areas, and few in sparsely populated areas. As a result, the determined "centre of mass" may be some distance away from the geographical centre (but it could be a reasonable proxy for the population centre, if that's what is desired).

    Adding a suitable weighting to each ZIP code would yield reasonable approximations: e.g. weight by the total land area each ZIP code covers to find geographical centre; or by population resident within each ZIP code to find actual population centre.

    The only way to have a true geographical centre would be to derive the centroid from each state's borders. You can download the coordinates of suitable bounding polygons from the CloudMade Downloads site.
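
The weighting suggested in note 2 amounts to multiplying each ZIP code's unit vector by its weight before averaging. A rough Python sketch, assuming you have some per-ZIP weight available (land area or population; that column is hypothetical, the table in the question only has lat/lng):

```python
import math

def weighted_centre(points):
    """points: iterable of (lat_deg, lon_deg, weight).

    Returns the weighted "centre of mass" as (lat_deg, lon_deg).
    The weight is an assumed extra value per ZIP code, e.g. area or population.
    """
    x = y = z = total = 0.0
    for lat, lon, w in points:
        lat_r, lon_r = math.radians(lat), math.radians(lon)
        x += w * math.cos(lat_r) * math.cos(lon_r)
        y += w * math.cos(lat_r) * math.sin(lon_r)
        z += w * math.sin(lat_r)
        total += w
    x, y, z = x / total, y / total, z / total
    return (math.degrees(math.atan2(z, math.hypot(x, y))),
            math.degrees(math.atan2(y, x)))
```

With equal weights this reduces to the unweighted centre used in the query; a heavier weight pulls the centre towards that ZIP code. In SQL the equivalent change would be to multiply each term inside the SUMs by the weight and divide by the sum of the weights instead of COUNT(*).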

eggyal answered Sep 24 '22 18:09