I am trying to read a shapefile into a GeoDataFrame.
Normally I just do this and it works:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
df = gpd.read_file("wild_fires/nbac_2016_r2_20170707_1114.shp")
But this time it gives me the error: b'Recode from ANSI 1252 to UTF-8 failed with the error: "Invalid argument".'
Full error:
---------------------------------------------------------------------------
CPLE_AppDefinedError Traceback (most recent call last)
<ipython-input-14-adcad0275d30> in <module>()
----> 1 df_wildfires_2016 = gpd.read_file("wild_fires/nbac_2016_r2_20170707_1114.shp")
/usr/local/lib/python3.6/site-packages/geopandas/io/file.py in read_file(filename, **kwargs)
19 """
20 bbox = kwargs.pop('bbox', None)
---> 21 with fiona.open(filename, **kwargs) as f:
22 crs = f.crs
23 if bbox is not None:
/usr/local/lib/python3.6/site-packages/fiona/__init__.py in open(path, mode, driver, schema, crs, encoding, layer, vfs, enabled_drivers, crs_wkt)
163 c = Collection(path, mode, driver=driver, encoding=encoding,
164 layer=layer, vsi=vsi, archive=archive,
--> 165 enabled_drivers=enabled_drivers)
166 elif mode == 'w':
167 if schema:
/usr/local/lib/python3.6/site-packages/fiona/collection.py in __init__(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, **kwargs)
151 if self.mode == 'r':
152 self.session = Session()
--> 153 self.session.start(self)
154 elif self.mode in ('a', 'w'):
155 self.session = WritingSession()
fiona/ogrext.pyx in fiona.ogrext.Session.start (fiona/ogrext2.c:8432)()
fiona/_err.pyx in fiona._err.GDALErrCtxManager.__exit__ (fiona/_err.c:1861)()
CPLE_AppDefinedError: b'Recode from ANSI 1252 to UTF-8 failed with the error: "Invalid argument".'
I've been trying to figure out why I am getting the error for a while but can't seem to find the answer.
The data was obtained from this webpage (I downloaded only the 2016 link): http://cwfis.cfs.nrcan.gc.ca/datamart/download/nbac?token=78e9bd6af67f71204e18cb6fa4e47515
Would anybody be able to help me? Thank you.
It seems your shapefile contains characters that GDAL cannot recode to UTF-8, which causes the fiona.open() call to fail (geopandas uses Fiona to open files).
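To see which encoding the shapefile itself declares, a small check along these lines may help. It is only a sketch: it assumes the usual shapefile layout, where an optional .cpg sidecar file names the encoding and byte 29 of the .dbf header holds the language driver ID (LDID). The function name declared_encoding is made up for illustration.

```python
from pathlib import Path


def declared_encoding(shp_path):
    """Return (cpg_text, ldid) for a shapefile; either value is None if unavailable."""
    base = Path(shp_path)
    cpg = base.with_suffix(".cpg")
    dbf = base.with_suffix(".dbf")

    # The .cpg sidecar, when present, is a tiny text file such as "UTF-8" or "1252".
    cpg_text = None
    if cpg.exists():
        cpg_text = cpg.read_text(encoding="ascii", errors="replace").strip()

    # Byte 29 of the dBASE header is the language driver ID (assumption: standard .dbf header).
    ldid = None
    if dbf.exists():
        with dbf.open("rb") as fh:
            header = fh.read(32)
        if len(header) >= 30:
            ldid = header[29]

    return cpg_text, ldid


# Example: print(declared_encoding("wild_fires/nbac_2016_r2_20170707_1114.shp"))
```

If the declared encoding does not match the bytes actually stored in the attribute table, a recode error like the one in the question is a plausible result.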
What solved this error for me was opening the shapefile (with QGIS, for example), selecting Save As, and setting the Encoding option to "UTF-8".
After doing this, I got no error when calling df = gpd.read_file("convertedShape.shp").
Another way to do this, without QGIS or a similar tool, is to read your shapefile and save it again (effectively converting it to the desired encoding). With OGR you can do something like this:
from osgeo import ogr
driver = ogr.GetDriverByName("ESRI Shapefile")
ds = driver.Open("nbac_2016_r2_20170707_1114.shp", 0)  # open your shapefile read-only
# get its layer
layer = ds.GetLayer()
# create the new shapefile to convert into
ds2 = driver.CreateDataSource("convertedShape.shp")
# create a polygon layer, matching the one your shapefile has
layer2 = ds2.CreateLayer("", None, ogr.wkbPolygon)
# iterate over all features of your original shapefile
for feature in layer:
    # and create a new feature in the converted shapefile from each one
    layer2.CreateFeature(feature)
# release the datasets so everything is flushed to disk
ds = layer = ds2 = layer2 = None
After this conversion, df = gpd.read_file("convertedShape.shp") also opened the file successfully. Hope this helps.
with fiona.open(file, encoding="UTF-8") as f:
worked for me.
Since you have GDAL installed, I recommend converting the file to UTF-8 using the CLI:
ogr2ogr output.shp input.shp -lco ENCODING=UTF-8
Worked like a charm for me. It's much faster than QGIS or Python and can be applied in a cluster environment.
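If you have many shapefiles to convert (on a cluster, for instance), the same ogr2ogr command can be driven from Python. This is a rough sketch, assuming ogr2ogr is on your PATH; the helper names ogr2ogr_utf8_cmd and convert_all are invented for this example, and the run parameter exists only so the command runner can be swapped out.

```python
import subprocess
from pathlib import Path


def ogr2ogr_utf8_cmd(src, dst):
    """Build the ogr2ogr call from above as an argument list (destination comes first)."""
    return ["ogr2ogr", str(dst), str(src), "-lco", "ENCODING=UTF-8"]


def convert_all(src_dir, dst_dir, run=subprocess.run):
    """Convert every .shp in src_dir, writing same-named files into dst_dir."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(src_dir).glob("*.shp")):
        dst = dst_dir / src.name
        # check=True raises CalledProcessError if ogr2ogr exits with a non-zero status
        run(ogr2ogr_utf8_cmd(src, dst), check=True)
        converted.append(dst)
    return converted


# Example: convert_all("wild_fires", "wild_fires_utf8")
```

Writing into a separate output directory keeps the original files untouched, so you can compare results if a conversion looks wrong.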
As an extension to this answer, you can pass fiona arguments through geopandas read_file:
df = gpd.read_file("filename", encoding="utf-8")
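If you are not sure which encoding the file really uses, you could try a few candidates in turn through that same encoding keyword. This is just a sketch: read_with_fallback is a made-up helper, the list of encodings is a guess, and the reader is passed in as an argument (gpd.read_file in practice).

```python
def read_with_fallback(path, reader, encodings=("utf-8", "cp1252", "latin-1")):
    """Call reader(path, encoding=...) for each candidate; return (result, encoding used)."""
    failures = []
    for enc in encodings:
        try:
            return reader(path, encoding=enc), enc
        except Exception as exc:
            # Broad catch on purpose: the error type raised for a bad encoding
            # depends on the geopandas/Fiona/GDAL versions installed.
            failures.append(f"{enc}: {exc}")
    raise ValueError(f"no candidate encoding worked for {path}: " + "; ".join(failures))


# Example (assumes geopandas is installed):
# import geopandas as gpd
# df, used = read_with_fallback("filename", gpd.read_file)
```

Keep in mind that a read which succeeds is not necessarily correct: latin-1 can decode any byte sequence, so check a few attribute values with accented characters before trusting the result.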