Concat multiple shapefiles via geopandas

I'm trying to combine multiple shapefiles with the following code:

import geopandas as gpd
import pandas as pd

for i in range(10,56):
    interesting_files = "/Users/m3105/Downloads/area/tl_2015_{}_arealm.shp".format(i)
    gdf_list = []
    for filename in sorted(interesting_files):
        gdf_list.append(gpd.read_file((filename)))
        full_gdf = pd.concat(gdf_list)

The directory /Users/m3105/Downloads/area contains several shapefiles, such as tl_2015_01_arealm.shp and tl_2015_02_arealm.shp, all the way up to tl_2015_56_arealm.shp. I'd like to combine all of these shapefiles without repeating their headers. However, whenever I try concatenating the files with the code above, I get the following error:

ValueError: Null layer: u''

I know how to concatenate CSV files, but I'm not sure how to do the same with shapefiles. I'd greatly appreciate any help.

M3105 asked Feb 19 '18 21:02



2 Answers

If you use pandas.concat as in @Paul H's answer, some geographic information, such as the coordinate reference system (CRS), may not be preserved, depending on the geopandas version. It worked for me when I passed the CRS explicitly, like this:

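import os
import geopandas as gpd
import pandas as pd

folder = "Your folder"
# Keep only names ending in .shp (a plain substring test would also match e.g. .shp.xml)
paths = sorted(os.path.join(folder, name)
               for name in os.listdir(folder) if name.endswith(".shp"))

# Read each shapefile once, then reuse the first frame's CRS for the result
frames = [gpd.read_file(p) for p in paths]
gdf = gpd.GeoDataFrame(pd.concat(frames, ignore_index=True),
                       crs=frames[0].crs)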

This way, the resulting GeoDataFrame keeps the CRS you need.
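
Reusing the first file's CRS is only safe if every input shares that CRS, because concatenation does not reproject any geometries. As a rough sketch of how that assumption could be checked first: the helper name `concat_same_crs` is my own and not part of geopandas, and it assumes the geometry column has the default name "geometry", as it does after read_file.

```python
import geopandas as gpd
import pandas as pd


def concat_same_crs(frames):
    """Concatenate GeoDataFrames after checking that they all share one CRS.

    Hypothetical helper for illustration; it raises instead of reprojecting.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("no GeoDataFrames to concatenate")
    target_crs = frames[0].crs
    for position, frame in enumerate(frames):
        if frame.crs != target_crs:
            raise ValueError(
                f"frame {position} has CRS {frame.crs}, expected {target_crs}"
            )
    combined = pd.concat(frames, ignore_index=True)
    # Pass the CRS explicitly so the result carries it either way
    return gpd.GeoDataFrame(combined, crs=target_crs)
```

With the list from the snippet above, this would be called as concat_same_crs(frames). If one file uses a different CRS, you would have to reproject it yourself (for example with to_crs) before concatenating.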

lemon answered Oct 20 '22 06:10


I can't test this since I don't have your data, but you want something like this (assuming Python 3):

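from pathlib import Path
import pandas
import geopandas

folder = Path("/Users/m3105/Downloads/area")
# Match every tl_2015_*_arealm.shp file; sorted() makes the read order deterministic
shapefiles = sorted(folder.glob("tl_2015_*_arealm.shp"))
# Read each shapefile and stack the rows into a single GeoDataFrame
gdf = pandas.concat([
    geopandas.read_file(shp)
    for shp in shapefiles
]).pipe(geopandas.GeoDataFrame)
# Write the combined layer next to the inputs
gdf.to_file(folder / 'compiled.shp')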
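
As for why the loop in the question fails: interesting_files is a single string, so sorted(interesting_files) yields its individual characters, and read_file is then called on one character at a time. That is presumably where the "Null layer" error comes from. The loop also resets gdf_list on every pass, and "{}".format(i) does not zero-pad, so names like tl_2015_01_arealm.shp would never be produced. If you prefer building the names by number instead of globbing, a sketch along these lines should work (candidate_paths is a made-up helper name; the 1 to 56 range is taken from the file names you listed):

```python
from pathlib import Path

single_path = "/Users/m3105/Downloads/area/tl_2015_10_arealm.shp"

# sorted() on a string returns a list of its characters, not a list of file names
chars = sorted(single_path)


def candidate_paths(folder, first=1, last=56):
    """Build zero-padded shapefile paths; both ends of the range are inclusive."""
    folder = Path(folder)
    return [folder / f"tl_2015_{i:02d}_arealm.shp" for i in range(first, last + 1)]


paths = candidate_paths("/Users/m3105/Downloads/area")
# Some numbers in the range may have no file, so keep only paths that exist
existing = [p for p in paths if p.exists()]
```

The existing list can then be passed to the list comprehension in place of shapefiles.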

Paul H answered Oct 20 '22 08:10