I'm trying to combine multiple shapefiles by implementing the following:
import geopandas as gpd
import pandas as pd
for i in range(10,56):
    interesting_files = "/Users/m3105/Downloads/area/tl_2015_{}_arealm.shp".format(i)
gdf_list = []
for filename in sorted(interesting_files):
    gdf_list.append(gpd.read_file((filename)))
full_gdf = pd.concat(gdf_list)
in which the directory /Users/m3105/Downloads/area has several shapefiles such as tl_2015_01_arealm.shp, tl_2015_02_arealm.shp, all the way up to tl_2015_56_arealm.shp. I'd like to combine all of these shapefiles and avoid repeating their headers. However, whenever I try concatenating the files using the code above, I get the following error:
ValueError: Null layer: u''
Normally, I'd know how to concat csv files together, but I'm not sure how to concat shapefiles. I'd greatly appreciate any help.
If you use pandas.concat as in the answer of @Paul H, some geographical information such as the coordinate reference system (crs) is not preserved by default. It worked for me when done like this:
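import os
import geopandas as gpd
import pandas as pd
folder = "Your folder"
# Keep only the .shp files; sidecar files such as .shp.xml are skipped
paths = sorted(os.path.join(folder, name) for name in os.listdir(folder) if name.endswith(".shp"))
# Read each shapefile once, then reuse the first frame's CRS for the result
frames = [gpd.read_file(p) for p in paths]
gdf = gpd.GeoDataFrame(pd.concat(frames, ignore_index=True), crs=frames[0].crs)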
This way, the resulting GeoDataFrame carries the CRS you need.
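One caveat: taking the CRS from the first file only makes sense if every file uses the same one. Below is a minimal sketch of reprojecting everything to the first frame's CRS before concatenating. The helper name `concat_with_common_crs` is made up for illustration, and the demo uses two in-memory points instead of real shapefiles, so treat it as a starting point rather than a tested recipe for your data.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point


def concat_with_common_crs(frames):
    """Concatenate GeoDataFrames after reprojecting them to the first frame's CRS.

    Hypothetical helper; assumes every frame has a CRS set
    (to_crs raises an error on a frame without one).
    """
    frames = list(frames)
    if not frames:
        raise ValueError("no GeoDataFrames to concatenate")
    target = frames[0].crs
    # Reproject only the frames whose CRS differs from the target.
    aligned = [f if f.crs == target else f.to_crs(target) for f in frames]
    return gpd.GeoDataFrame(pd.concat(aligned, ignore_index=True), crs=target)


# Demo data: the same kind of layer stored in two different CRSs.
a = gpd.GeoDataFrame({"name": ["a"]}, geometry=[Point(0, 0)], crs="EPSG:4326")
b = gpd.GeoDataFrame({"name": ["b"]}, geometry=[Point(1, 1)], crs="EPSG:4326").to_crs("EPSG:3857")

combined = concat_with_common_crs([a, b])
print(combined.crs)
```

With real files you would pass `[gpd.read_file(p) for p in paths]` instead of the demo frames.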
I can't test this since I don't have your data, but you want something like this (assuming Python 3):
from pathlib import Path
import pandas
import geopandas
folder = Path("/Users/m3105/Downloads/area")
shapefiles = folder.glob("tl_2015_*_arealm.shp")
gdf = pandas.concat([
    geopandas.read_file(shp)
    for shp in shapefiles
]).pipe(geopandas.GeoDataFrame)
gdf.to_file(folder / 'compiled.shp')
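For what it's worth, the `Null layer` error in the question most likely comes from the loop itself: `interesting_files` is overwritten on every pass, so it ends up as a single string, and `sorted()` on a string yields individual characters, each of which is then handed to `read_file`. If you'd rather build the paths from the numeric codes than glob for them, here is a minimal sketch. The helper name `shapefile_paths` is made up for illustration, and I'm assuming the codes are zero-padded to two digits and run from 01 to 56, as the question describes.

```python
from pathlib import Path


def shapefile_paths(folder, codes=range(1, 57)):
    """Return the tl_2015_XX_arealm.shp paths that actually exist in folder.

    Hypothetical helper: the file-name pattern is taken from the question.
    """
    folder = Path(folder)
    candidates = (
        folder / "tl_2015_{:02d}_arealm.shp".format(code) for code in codes
    )
    # Some codes may have no file, so keep only the paths present on disk.
    return [path for path in candidates if path.is_file()]


# The pitfall from the question: sorting a string gives characters, not paths.
print(sorted("a.shp"))  # ['.', 'a', 'h', 'p', 's']
```

The list returned by `shapefile_paths(folder)` can be used in place of `shapefiles` in the list comprehension above.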