I'm kind of new to Python, so I'm hoping that the answer to my question is relatively straightforward.
I'm trying to make a choropleth map using geopandas. However, since I'm making multiple maps that need to be compared to each other, it is indispensable that I use a custom data classification scheme (rather than quantiles or jenks). Hence, I've been trying to work with the User_Defined scheme, and I'm able to create the bins but I don't know how to apply them to the map itself.
This is what I did to create my classification scheme:
import pysal.esda.mapclassify as ps
from pysal.esda.mapclassify import User_Defined
bins = [5, 20, 100, 600, 1000, 3000, 5000, 10000, 20000, 400000]
ud = User_Defined(projected_world_exports['Value'], bins)
(where 'Value' is the column I plot in the map)
And then, when I try to plot the choropleth map, I don't know what the scheme is meant to be called:
projected_world_exports.plot(column='Value', cmap='Greens', scheme = ?????)
If anyone could help I would be hugely appreciative!
Thanks x
GeoPandas makes it easy to create choropleth maps (maps where the color of each shape is based on the value of an associated variable). Simply use the plot command with the column argument set to the column whose values you want used to assign colors.
GeoPandas is an open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. GeoPandas further depends on fiona for file access and matplotlib for plotting.
A choropleth map is a thematic map that is used to represent statistical data using the color mapping symbology technique. It displays enumeration units, or divided geographical areas or regions that are colored, shaded or patterned in relation to a data variable.
Here is an alternative approach that does not require modifying the geopandas code. It involves first labeling the bins so that you can create a custom colormap that maps each bin label to a specific color. A column must then be created in your geodataframe that specifies which bin label is applied to each row in the geodataframe, and this column is then used to plot the choropleth using the custom colormap.
from matplotlib.colors import LinearSegmentedColormap

bins = [5, 20, 100, 600, 1000, 3000, 5000, 10000, 20000, 400000]

# Maps a value to the label of its bin.
# The labels must start at 0 and end at 1.
def bin_mapping(x):
    for idx, bound in enumerate(bins):
        if x < bound:
            return idx / (len(bins) - 1.0)
    # Values at or above the last bound fall into the last bin
    # (without this the function would return None for them)
    return 1.0

# Create the list of bin labels and the list of colors
# corresponding to each bin
bin_labels = [idx / (len(bins) - 1.0) for idx in range(len(bins))]
color_list = ['#edf8fb', '#b2e2e2', '#66c2a4', '#2ca25f', '#006d2c',
              '#fef0d9', '#fdcc8a', '#fc8d59', '#e34a33', '#b30000']

# Create the custom color map
cmap = LinearSegmentedColormap.from_list(
    'mycmap', [(lbl, color) for lbl, color in zip(bin_labels, color_list)])

projected_world_exports['Bin_Lbl'] = projected_world_exports['Value'].apply(bin_mapping)
projected_world_exports.plot(column='Bin_Lbl', cmap=cmap, alpha=1, vmin=0, vmax=1)
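To sanity-check the approach above before plotting, it can help to look up which color a given value should end up with. The following is only a minimal sketch using the standard library; expected_color is a hypothetical helper (not part of geopandas or matplotlib), and sending values at or above the last bound to the last color is an assumption.

```python
from bisect import bisect_right

# Same bins and colors as in the answer above.
bins = [5, 20, 100, 600, 1000, 3000, 5000, 10000, 20000, 400000]
color_list = ['#edf8fb', '#b2e2e2', '#66c2a4', '#2ca25f', '#006d2c',
              '#fef0d9', '#fdcc8a', '#fc8d59', '#e34a33', '#b30000']

def expected_color(x):
    """Hypothetical helper: the color a value should end up with."""
    # bisect_right returns the index of the first bound strictly greater
    # than x, which matches the `x < bound` test in bin_mapping.
    idx = bisect_right(bins, x)
    # Assumption: values at or above the last bound share the last color.
    idx = min(idx, len(color_list) - 1)
    return color_list[idx]

# Made-up sample values, only to inspect the lookup.
for value in (0, 5, 19.9, 250, 400000):
    print(value, expected_color(value))
```

If the colors on the map do not match this lookup for a few known rows, the bin labels and the color list are probably out of step.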
I took a look at the code of the geopandas plotting function (https://github.com/geopandas/geopandas/blob/master/geopandas/plotting.py), and I guess the plot method only accepts one of three scheme names ("quantiles", "equal_interval", "fisher_jenks"), not a list of bins or a pysal.esda.mapclassify classifier such as User_Defined.
(I guess it could be linked to that issue where the last comment is about defining an API for "user defined" binning.)
However, for now I guess you can achieve this by slightly modifying and reusing the functions from the file I linked.
For example, you could write your own version of plot_dataframe like this:
import numpy as np

def plot_dataframe(s, column, binning, cmap,
                   linewidth=1.0, figsize=None, **color_kwds):
    import matplotlib.pyplot as plt

    # Color by the class index of each row (binning.yb)
    # rather than by the raw values in s[column]
    values = np.array(binning.yb)
    fig, ax = plt.subplots(figsize=figsize)
    ax.set_aspect('equal')
    mn = values.min()
    mx = values.max()
    # Only polygon geometries are drawn
    poly_idx = np.array(
        (s.geometry.type == 'Polygon') | (s.geometry.type == 'MultiPolygon'))
    polys = s.geometry[poly_idx]
    if not polys.empty:
        plot_polygon_collection(ax, polys, values[poly_idx], True,
                                vmin=mn, vmax=mx, cmap=cmap,
                                linewidth=linewidth, **color_kwds)
    plt.draw()
    return ax
Then you would need to define the functions _flatten_multi_geoms and plot_polygon_collection by copying them from the same file, and you are ready to use it like this:
bins = [5, 20, 100, 600, 1000, 3000, 5000, 10000, 20000, 400000]
ud = User_Defined(projected_world_exports['Value'], bins)
plot_dataframe(projected_world_exports, 'Value', ud, 'Greens')
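The function above colors each polygon by its class index (binning.yb) instead of its raw value. As a rough illustration of what those indices look like, here is a pure-Python sketch. class_ids is a hypothetical stand-in, not the pysal implementation, and it assumes the bins are upper bounds that are inclusive on the right; the sample values are made up.

```python
from bisect import bisect_left
from collections import Counter

bins = [5, 20, 100, 600, 1000, 3000, 5000, 10000, 20000, 400000]

def class_ids(values, upper_bounds):
    """Hypothetical stand-in for the `yb` attribute of a classifier."""
    # Assumption: class j holds values with bounds[j-1] < v <= bounds[j],
    # so a value equal to a bound belongs to the class that bound closes.
    return [bisect_left(upper_bounds, v) for v in values]

sample = [3, 5, 6, 20, 21, 999, 400000]  # made-up values, not real export data
ids = class_ids(sample, bins)
print(ids)           # one class index per value
print(Counter(ids))  # how many values fall into each class
```

Under that assumption a value sitting exactly on a bound (5 or 20 here) lands in the lower class, which differs from the strict `x < bound` test used in the colormap-based answer, so boundary values may be colored differently by the two approaches.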
This can be done easily using the UserDefined scheme. When you specify such a scheme, a mapclassify.MapClassifier object is used under the hood. In fact, all the supported schemes are provided by mapclassify.
To pass your bins, use the classification_kwds argument.
So your code is going to be:
projected_world_exports.plot(
    column='Value',
    cmap='Greens',
    scheme='UserDefined',
    classification_kwds={'bins': bins}
)
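Since the goal in the question is to compare several maps, one option is to build the plot arguments once and reuse them, so that every map is classified with exactly the same bins. This is only a sketch: shared_plot_kwargs is a hypothetical helper, and the second GeoDataFrame name in the usage comment is made up for illustration.

```python
def shared_plot_kwargs(bins, column='Value', cmap='Greens'):
    """Hypothetical helper: one set of plot arguments reused for every map."""
    return {
        'column': column,
        'cmap': cmap,
        'scheme': 'UserDefined',
        # Copy the list so later edits to `bins` do not change earlier maps.
        'classification_kwds': {'bins': list(bins)},
    }

bins = [5, 20, 100, 600, 1000, 3000, 5000, 10000, 20000, 400000]
kwargs = shared_plot_kwargs(bins)
print(kwargs)

# Usage, assuming two GeoDataFrames that both have a 'Value' column
# (projected_world_imports is a made-up name):
# projected_world_exports.plot(**kwargs)
# projected_world_imports.plot(**kwargs)
```

Keeping the bins in one place means a change to the classification is picked up by all maps at once, instead of being edited separately in each plot call.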