I have a simple dataframe like this:
colC zipcode count
val1 71023 1
val2 75454 3
val3 77034 2
val2 78223 3
val2 91791 4
These are all US zipcodes.
I want to plot the zipcodes on a map, colored by the value in colC. For instance, zipcode 75454 has val2 in colC, so it should get a different color than zipcode 71023, which has val1 in colC.
Additionally, I want to create a heatmap where the count column sets the intensity across the map.
I went over some of the geopandas documentation, but it looks like I have to convert the zipcodes to shapefiles or GeoJSON in order to define their boundaries. I am not able to figure this step out.
Is geopandas the best tool to achieve this?
Any help is much appreciated.
UPDATE
I was able to make some progress with the following:
import pandas as pd
import matplotlib.pyplot as plt
import pgeocode
import geopandas as gpd

# Read zipcodes as strings so leading zeros survive and pgeocode gets text input
edf = pd.read_csv('myFile.tsv', sep='\t', header=None, index_col=False,
                  names=['colC', 'zipcode', 'count'], dtype={'zipcode': str})

# Look up the coordinates of each zipcode once and reuse the result
nomi = pgeocode.Nominatim('us')
coords = nomi.query_postal_code(edf['zipcode'].tolist())
edf['Latitude'] = coords['latitude'].values
edf['Longitude'] = coords['longitude'].values

# Build point geometries (x = longitude, y = latitude)
gdf = gpd.GeoDataFrame(edf, geometry=gpd.points_from_xy(edf['Longitude'], edf['Latitude']))

# Base map of the world; gpd.datasets was removed in geopandas 1.0,
# so this line needs an older geopandas release
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
ax = world.plot(figsize=(10, 6))
gdf.plot(ax=ax, marker='o', color='red', markersize=15)
plt.savefig('world.jpg')
However, this gives me a map of the entire world. How can I restrict it to show just the US, since that is where all of my zipcodes are from?
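One way to zoom in, sketched here with plain matplotlib: the object returned by world.plot(...) is a regular matplotlib Axes, so setting its x and y limits to a longitude/latitude window should crop the view. The bounding box values and the zoom_to_bounds helper are assumptions for illustration (an approximate box around the contiguous US), and the scattered points are placeholders rather than real data. geopandas may still adjust the limits slightly to preserve the map's aspect ratio.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Approximate lon/lat window around the contiguous US (an assumption;
# widen it if Alaska or Hawaii matter).
US_BOUNDS = {"lon": (-125.0, -66.0), "lat": (24.0, 50.0)}


def zoom_to_bounds(ax, bounds=US_BOUNDS):
    """Restrict an existing Axes to the given longitude/latitude window."""
    ax.set_xlim(*bounds["lon"])
    ax.set_ylim(*bounds["lat"])
    return ax


fig, ax = plt.subplots(figsize=(10, 6))
# Placeholder (lon, lat) points for illustration; the last one lies outside the window.
ax.scatter([-95.0, -118.0, 2.3], [32.0, 34.0, 48.9], color="red", s=15)
zoom_to_bounds(ax)
fig.savefig("us_only.jpg")
```

With the geopandas code above, the same helper would be called on the Axes returned by world.plot(figsize=(10, 6)) before saving the figure.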
It turns out plotly is the best fit for my use case:
import pandas as pd
import pgeocode
import plotly.graph_objects as go

# Read zipcodes as strings so leading zeros survive and pgeocode gets text input
edf = pd.read_csv('myFile.tsv', sep='\t', header=None, index_col=False,
                  names=['colC', 'zipcode', 'count'], dtype={'zipcode': str})

# Look up the coordinates of each zipcode once and reuse the result
nomi = pgeocode.Nominatim('us')
coords = nomi.query_postal_code(edf['zipcode'].tolist())
edf['Latitude'] = coords['latitude'].values
edf['Longitude'] = coords['longitude'].values

fig = go.Figure(data=go.Scattergeo(
    lon=edf['Longitude'],
    lat=edf['Latitude'],
    text=edf['colC'],           # shown on hover
    mode='markers',
    marker_color=edf['count'],  # marker color encodes the count
))
fig.update_layout(
    title='colC Distribution',
    geo_scope='usa',            # restrict the map to the US
)
fig.show()