Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dimension for geozones or Lat & Long in data warehouse

I have a DimPlace dimension that has the name of the place (manually entered by the user) and the latitude and longitude of the place (automatically captured). Since the Places are entered manually the same place could be in there multiple time with different names, additionally, two distinct places could be very close to each other.

We want to be able to analyze the MPG between two "places" but we want to group them to make a larger area - i.e. using lat & long put all the various spellings of one location, as well as distinct but very close locations, in one record.

I am planning on making a new dimension for this - something like DimPlaceGeozone. I am looking for a resource to help with loading all the lat & long values mapped to ... something?? Maybe postal code, or city name? Sometimes you can find a script to load common dimensions (like DimTime) - I would love something similar for lat & long values in North America?

like image 688
tember Avatar asked Jul 25 '17 21:07

tember


1 Answers

I've done something similar in the past... The one stumbling block I hit up front was that 2 locations, straddling a border could be physically closer together than 2 locations that are both in the same area.

I got around it by creating a "double grid" system that causes each location to fall into 4 areas. That way 2 locations that share at least 1 "area" you know they are within range of each other.

Here's an example, covering most of the United States...

IF OBJECT_ID('tempdb..#LatLngAreas', 'U') IS NOT NULL 
DROP TABLE #LatLngAreas;
GO

WITH 
    cte_Lat AS (
        SELECT 
            t.n,
            BegLatRange = -37.9 + (t.n / 10.0),
            EndLatRange  = -37.7 + (t.n / 10.0)
        FROM
            dbo.tfn_Tally(1030, 0) t
        ),
    cte_Lng AS (
        SELECT 
            t.n,
            BegLngRange = -159.7 + (t.n / 10.0),
            EndLngRange = -159.5 + (t.n / 10.0)
        FROM
            dbo.tfn_Tally(3050, 0) t
        )
SELECT 
    Area_ID = ROW_NUMBER() OVER (ORDER BY lat.n, lng.n),
    lat.BegLatRange, 
    lat.EndLatRange, 
    lng.BegLngRange, 
    lng.EndLngRange
    INTO #LatLngAreas
FROM
    cte_Lat lat
    CROSS JOIN cte_Lng lng;


SELECT 
    b3.Branch_ID,
    b3.Name,
    b3.Lat,
    b3.Lng,
    lla.Area_ID
FROM
    dbo.ContactBranch b3    -- replace with DimPlace
    JOIN #LatLngAreas lla
        ON b3.Lat BETWEEN lla.BegLatRange AND lla.EndLatRange
        AND b3.lng BETWEEN lla.BegLngRange AND lla.EndLngRange;

HTH, Jason

like image 62
Jason A. Long Avatar answered Sep 30 '22 17:09

Jason A. Long