i have a postgres database with millions of rows in it it has a column called geom which contains the boundary of a property.
using a python script i am extracting the information from this table and re-inserting it into a new table.
when i insert in the new table the script bugs out with the following:
Traceback (most recent call last):
File "build_parcels.py", line 258, in <module>
main()
File "build_parcels.py", line 166, in main
update_cursor.executemany("insert into parcels (par_id, street_add, title_no, proprietors, au_name, ua_name, geom) VALUES (%s, %s, %s, %s, %s, %s, %s)", inserts)
psycopg2.IntegrityError: new row for relation "parcels" violates check constraint "enforce_geotype_geom"
The new table has a check constraint enforce_geotype_geom = ((geometrytype(geom) = 'POLYGON'::text) OR (geom IS NULL)) whereas the old table does not, so im guessing theres dud data or non polygon (perhaps multipolygon data?) in the old table. i want to keep the new data as polygon so want to not insert anything else.
Initially i tried wrapping the query with standard python error handling with the hope that the dud geom rows would fail but the script would keep running , but the script has been written to commit at the end not each row so it doesnt work.
I think what i need to do is iterate through the old table geom rows and check what type of geometry they are so i can establish whether or not i want to keep it or throw it away before i insert into the new table
Whats the best way of going about this?
This astonishingly useful bit of PostGIS SQL should help you figure it out... there are many geometry type tests in here:
-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
--
-- $Id: cleanGeometry.sql 2008-04-24 10:30Z Dr. Horst Duester $
--
-- cleanGeometry - remove self- and ring-selfintersections from
-- input Polygon geometries
-- http://www.sogis.ch
-- Copyright 2008 SO!GIS Koordination, Kanton Solothurn, Switzerland
-- Version 1.0
-- contact: horst dot duester at bd dot so dot ch
--
-- This is free software; you can redistribute and/or modify it under
-- the terms of the GNU General Public Licence. See the COPYING file.
-- This software is without any warrenty and you use it at your own risk
--
-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CREATE OR REPLACE FUNCTION cleanGeometry(geometry)
RETURNS geometry AS
$BODY$DECLARE
inGeom ALIAS for $1;
outGeom geometry;
tmpLinestring geometry;
Begin
outGeom := NULL;
-- Clean Process for Polygon
IF (GeometryType(inGeom) = 'POLYGON' OR GeometryType(inGeom) = 'MULTIPOLYGON') THEN
-- Only process if geometry is not valid,
-- otherwise put out without change
if not isValid(inGeom) THEN
-- create nodes at all self-intersecting lines by union the polygon boundaries
-- with the startingpoint of the boundary.
tmpLinestring := st_union(st_multi(st_boundary(inGeom)),st_pointn(boundary(inGeom),1));
outGeom = buildarea(tmpLinestring);
IF (GeometryType(inGeom) = 'MULTIPOLYGON') THEN
RETURN st_multi(outGeom);
ELSE
RETURN outGeom;
END IF;
else
RETURN inGeom;
END IF;
------------------------------------------------------------------------------
-- Clean Process for LINESTRINGS, self-intersecting parts of linestrings
-- will be divided into multiparts of the mentioned linestring
------------------------------------------------------------------------------
ELSIF (GeometryType(inGeom) = 'LINESTRING') THEN
-- create nodes at all self-intersecting lines by union the linestrings
-- with the startingpoint of the linestring.
outGeom := st_union(st_multi(inGeom),st_pointn(inGeom,1));
RETURN outGeom;
ELSIF (GeometryType(inGeom) = 'MULTILINESTRING') THEN
outGeom := multi(st_union(st_multi(inGeom),st_pointn(inGeom,1)));
RETURN outGeom;
ELSIF (GeometryType(inGeom) = '<NULL>' OR GeometryType(inGeom) = 'GEOMETRYCOLLECTION') THEN
RETURN NULL;
ELSE
RAISE NOTICE 'The input type % is not supported %',GeometryType(inGeom),st_summary(inGeom);
RETURN inGeom;
END IF;
End;$BODY$
LANGUAGE 'plpgsql' VOLATILE;
Option 1 is to create a savepoint before each insert and roll back to that safepoint if an INSERT
fails.
Option 2 is to attach the check constraint expression as a WHERE
condition on the original query that produced the data to avoid selecting it at all.
The best answer depends on the size of the tables, the relative number of faulty rows, and how fast and often this is supposed to run.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With