
Display data on a real map based on postal code

I'm trying to display data on a real map (the data should all fall within Ontario, Canada). I have a .csv file with two columns, A and B: A is a postal code and B is its associated value (an integer from 1 to 5). I want to find the area that each postal code in A maps to and color it based on the associated value. For example, given the postal codes P0G and P0A (the first 3 characters of a Canadian postal code, which represent an area) with associated values 2 and 5, I want these two areas to be shown in different colors on a real map (maybe green for 2 and red for 5).

To be honest, I don't know how to do this. Maybe with Python, some web service, or some API? I have no experience visualizing data on a real map.

The question described above is a simplified version. The actual data has more columns (domains), with one postal code per row, and I need to generate |domains| maps, one for each domain. But I figure that once I know how to do one domain, I should be able to do them all.

I tried an online analytics tool, SimplyAnalytics, but couldn't figure out how to display many areas in different colors at the same time. I also searched for similar questions, but because the end goals and data formats differ, the code and methods vary widely.

Thanks!!!

Some sample data: [image]

asked Sep 21 '19 20:09 by Kenny


2 Answers

If you want to do it in Python, you could use the geopandas library. Below is some sample code (GitHub Gist). First, get the shapefile that defines the area of each Forward Sortation Area, i.e. the first three characters of the postal code (link). Then filter it to the postal codes in Ontario and join it with the data you want to plot.

import geopandas
import pandas as pd
import pandas_bokeh

# Render Bokeh plots inline in a Jupyter notebook
pandas_bokeh.output_notebook()

# Load the Forward Sortation Area boundaries and keep Ontario (province code 35)
canada = geopandas.read_file("./gfsa000b11a_e.shp")
ontario = canada[canada['PRUID'] == '35']

# Sample data to plot
df = pd.DataFrame({'PCODE': ['P0V', 'P0L', 'P0T', 'P0Y', 'P0G', 'P2N'], 'A': [6, 3, 5, 2, 2, 4]})

# Join the Ontario dataset with the sample data on the FSA code
new_df = ontario.join(df.set_index('PCODE'), on='CFSAUID')

# Color each area by column A; hovering shows the FSA code and its value
new_df.plot_bokeh(simplify_shapes=20000,
                  category="A",
                  colormap="Spectral",
                  hovertool_columns=["CFSAUID", "A"])
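
The join above expects one row per three-character code, while the question describes a CSV with one postal code per row and one column per domain. A minimal sketch of that preparation step, using only the standard library, might look like the following. The column names (postal_code, health, income), the sample rows, the color palette, and the choice to average values that fall in the same area are all assumptions made for illustration; none of them come from the question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical palette from 1 (green) to 5 (red); the question only
# suggests "maybe 2 with green and 5 with red".
PALETTE = {1: "green", 2: "yellowgreen", 3: "gold", 4: "orange", 5: "red"}


def to_fsa(postal_code):
    """Return the first three characters of a postal code, upper-cased."""
    return postal_code.replace(" ", "").upper()[:3]


def values_by_domain(csv_file, code_column="postal_code"):
    """Build {domain: {fsa: value}} from a CSV with one column per domain.

    Values that share an area are averaged and rounded (an assumption).
    """
    collected = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(csv_file):
        fsa = to_fsa(row[code_column])
        for domain, raw in row.items():
            if domain == code_column or not raw:
                continue  # skip the code column and empty cells
            collected[domain][fsa].append(int(raw))
    return {
        domain: {fsa: round(sum(v) / len(v)) for fsa, v in per_fsa.items()}
        for domain, per_fsa in collected.items()
    }


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "postal_code,health,income\n"
    "P0G 1A0,2,4\n"
    "p0a1b0,5,1\n"
    "P0G 1C0,4,4\n"
)
tables = values_by_domain(sample)
colors = {d: {fsa: PALETTE[v] for fsa, v in t.items()} for d, t in tables.items()}
```

Each per-domain dictionary in tables could then be turned into a two-column DataFrame (code and value) and joined to the Ontario shapes as in the code above, once per domain.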


answered Sep 28 '22 18:09 by Samik


@Samik's answer is great and works perfectly for 3-character postal codes. For full 6-character postal codes, however, plot_bokeh is really slow: in my case, the Ontario boundary shapefile took 21 hours to render (I timed it in Python; maybe my machine is slow). With multiple domains that becomes 21 × |domains| hours, so time will be a huge issue.

A better approach for 6-character postal codes (and large files in general) is to use Tableau: load the spatial file, render the map, and select the appropriate parameters to customize it. It is much quicker than plot_bokeh. Tableau doesn't involve programming, though, so it is better suited to general users.

answered Sep 28 '22 20:09 by Kenny