I'm trying to save a GeoPandas GeoDataFrame as a shapefile that is written directly into a zipped folder.
As any shapefile user knows, a shapefile is not a single file but rather a collection of files that are meant to be read together. So calling myGDF.to_file(filename='myshapefile.shp', driver='ESRI Shapefile') creates not only myshapefile.shp but also myshapefile.prj, myshapefile.dbf, myshapefile.shx and myshapefile.cpg. This is probably why I am struggling to get the syntax right here.
Consider, for instance, a dummy GeoPandas GeoDataFrame like:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
data = pd.DataFrame({'name': ['a', 'b', 'c'],
                     'property': ['foo', 'bar', 'foo'],
                     'x': [173994.1578792833, 173974.1578792833, 173910.1578792833],
                     'y': [444135.6032947102, 444186.6032947102, 444111.6032947102]})
geometry = [Point(xy) for xy in zip(data['x'], data['y'])]
myGDF = gpd.GeoDataFrame(data, geometry=geometry)
I saw people using gzip, so I tried:
import geopandas as gpd
myGDF.to_file(filename='myshapefile.shp.gz', driver='ESRI Shapefile',compression='gzip')
But it did not work.
Then I tried the following (in a Google Colab environment):
import zipfile
pathname = '/content/'
filename = 'myshapefile.shp'
zip_file = 'myshapefile.zip'
with zipfile.ZipFile(zip_file, 'w') as zipf:
    zipf.write(myGDF.to_file(filename = '/content/myshapefile.shp', driver='ESRI Shapefile'))
But it only saves the .shp file in a zip folder, while the rest is written next to the zip folder.
How can I write a Geopandas DataFrame as a zipped shapefile directly?
Simply use zip as a file extension, keeping the name of the driver:
myGDF.to_file(filename='myshapefile.shp.zip', driver='ESRI Shapefile')
This should work with GDAL 3.1 or newer.
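To confirm that an archive really holds the whole shapefile and not just the .shp part, its contents can be inspected with the standard library. This is only a minimal sketch: it assumes .shp, .shx and .dbf are the parts worth checking for, and the helper name missing_shapefile_parts is hypothetical, not part of GeoPandas.

```python
import zipfile
from pathlib import PurePosixPath

# Assumption: these three parts are treated as the required ones.
REQUIRED_PARTS = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_path):
    """Return the required extensions that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        # Collect the lower-cased extension of every archive member.
        present = {PurePosixPath(name).suffix.lower() for name in zf.namelist()}
    return REQUIRED_PARTS - present
```

An empty set means all three parts were found; anything else names the parts that ended up outside the archive.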
Something like this would work for you: write the shapefile parts to a fresh temporary directory and then zip up everything inside it.
import tempfile
import zipfile
from pathlib import Path
with tempfile.TemporaryDirectory() as temp_dir:
    temp_dir = Path(temp_dir)
    # In real use, write the shapefile parts here instead of the dummy files below:
    # geodataframe.to_file(str(temp_dir / "myshapefile.shp"))
    with open(temp_dir / "a.shp", "w") as _f:
        _f.write("blah")
    with open(temp_dir / "a.prj", "w") as _f:
        _f.write("blah")
    # Zip every file in the temp dir, storing each under its bare file name
    with zipfile.ZipFile('myshapefile.zip', 'w') as zipf:
        for f in temp_dir.glob("*"):
            zipf.write(f, arcname=f.name)
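The same temp-directory idea can be wrapped in a small reusable helper. The following is a sketch under stated assumptions rather than a definitive implementation: zip_shapefile_parts and its write_parts callback are hypothetical names, and the callback is assumed to write all shapefile parts next to the .shp path it receives.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_shapefile_parts(write_parts, zip_path, stem="myshapefile"):
    """Run write_parts(shp_path) in a temp dir, then zip every '<stem>.*' file."""
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_dir = Path(temp_dir)
        # The callback is expected to create <stem>.shp and its sidecar files.
        write_parts(temp_dir / f"{stem}.shp")
        with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zipf:
            for part in sorted(temp_dir.glob(f"{stem}.*")):
                # arcname keeps only the file name, so the archive stays flat.
                zipf.write(part, arcname=part.name)
    return Path(zip_path)
```

With the GeoDataFrame from the question, the call would look something like zip_shapefile_parts(lambda p: myGDF.to_file(str(p), driver='ESRI Shapefile'), 'myshapefile.zip'). Passing the writer as a callback keeps the helper free of any GeoPandas dependency.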