
Exporting a Geopandas dataframe to a zipped shapefile directly

I'm trying to save a GeoPandas GeoDataFrame as a shapefile that is written directly into a zip archive.

As any shapefile user knows, a shapefile is not a single file but rather a collection of files that are meant to be read together. So calling myGDF.to_file(filename='myshapefile.shp', driver='ESRI Shapefile') creates not only myshapefile.shp but also myshapefile.prj, myshapefile.dbf, myshapefile.shx and myshapefile.cpg. This is probably why I am struggling to get the syntax right here.

Consider, for instance, a dummy GeoDataFrame like this:

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

data = pd.DataFrame({'name': ['a', 'b', 'c'],
                     'property': ['foo', 'bar', 'foo'],
                     'x': [173994.1578792833, 173974.1578792833, 173910.1578792833],
                     'y': [444135.6032947102, 444186.6032947102, 444111.6032947102]})
geometry = [Point(xy) for xy in zip(data['x'], data['y'])]
myGDF = gpd.GeoDataFrame(data, geometry=geometry)

I saw people using gzip, so I tried:

import geopandas as gpd
myGDF.to_file(filename='myshapefile.shp.gz', driver='ESRI Shapefile',compression='gzip')

But it did not work.

Then I tried the following (in a Google Colab environment):

import zipfile
pathname = '/content/'
filename = 'myshapefile.shp'
zip_file = 'myshapefile.zip'
with zipfile.ZipFile(zip_file, 'w') as zipf:
   zipf.write(myGDF.to_file(filename = '/content/myshapefile.shp', driver='ESRI Shapefile'))

But this only puts the .shp file into the zip archive, while the other component files are written next to the archive.

How can I write a Geopandas DataFrame as a zipped shapefile directly?

saQuist asked Apr 14 '21 08:04


2 Answers

Simply use zip as a file extension, keeping the name of the driver:

myGDF.to_file(filename='myshapefile.shp.zip', driver='ESRI Shapefile')

This should work with GDAL 3.1 or newer.
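
To confirm that the resulting archive really holds every component, one option is to list its members with the standard-library zipfile module. This is only a sketch: the helper name missing_shapefile_parts is hypothetical, and it assumes that .shp, .shx and .dbf are the components you want to require (.prj and .cpg are treated as optional).

```python
import zipfile
from pathlib import PurePosixPath

# Mandatory components of a shapefile; .prj and .cpg are optional sidecars.
REQUIRED_SUFFIXES = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_path):
    """Return the required suffixes absent from the archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path) as zf:
        # Collect the lower-cased extension of every archive member.
        present = {PurePosixPath(name).suffix.lower() for name in zf.namelist()}
    return REQUIRED_SUFFIXES - present
```

Calling missing_shapefile_parts('myshapefile.shp.zip') after the to_file call above should return an empty set if all required components were written into the archive.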

GreatEmerald answered Sep 25 '22 14:09


Something like this would work for you - dump the shapefile(s) to a fresh new tempdir and then zip up everything inside that tempdir.

import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:

    temp_dir = Path(temp_dir)

    # Real use: geodataframe.to_file(str(temp_dir / "myshapefile.shp"))
    # The placeholder files below only stand in for the shapefile components.
    with open(temp_dir / "a.shp", "w") as _f:
        _f.write("blah")
    with open(temp_dir / "a.prj", "w") as _f:
        _f.write("blah")

    with zipfile.ZipFile('myshapefile.zip', 'w') as zipf:
        for f in temp_dir.glob("*"):
            zipf.write(f, arcname=f.name)
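
The same pattern can be wrapped in a small reusable function. The following is a sketch under assumptions, not part of GeoPandas: zip_via_tempdir and its write_files parameter are hypothetical names, and ZIP_DEFLATED is chosen here so the members are compressed (the zipfile default stores them uncompressed).

```python
import tempfile
import zipfile
from pathlib import Path


def zip_via_tempdir(write_files, zip_path):
    """Call write_files(temp_dir), then zip every file it created (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_dir = Path(temp_dir)
        # The caller writes all shapefile components into the temporary directory.
        write_files(temp_dir)
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
            for f in sorted(temp_dir.iterdir()):
                if f.is_file():
                    # arcname=f.name keeps the archive flat, without the temp path.
                    zipf.write(f, arcname=f.name)
    return zip_path


# Assumed usage with the GeoDataFrame from the question:
# zip_via_tempdir(
#     lambda d: myGDF.to_file(str(d / "myshapefile.shp"), driver="ESRI Shapefile"),
#     "myshapefile.zip",
# )
```

Because the temporary directory is deleted when the with block exits, only the zip archive remains on disk afterwards.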

fishstix44 answered Sep 25 '22 14:09