Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Invalid POLYGON bigQuery

I have a column with POLYGON strings, and before processing the geo data I need to convert them with ST_GEOGFROMTEXT. However, I probably have some rows with invalid polygons and I get the following error

Error: ST_GeogFromText failed: Invalid polygon loop: Edge 0 has duplicate vertex with edge 4025

This is my query

SELECT st_geogfromtext(string_field_1)
FROM t

Is there a way to deal with incorrect polygons, or at least to identify which row returns the problem?

like image 468
Alex Avatar asked Dec 06 '18 22:12

Alex


People also ask

What is Google’s polygon BigQuery?

The initiative aims to provide developers, data analysts, and crypto-enthusiasts with a better understanding of the Polygon blockchain. Google’s BigQuery would ensure that query and in-depth analysis of Polygon’s on-chain data are conducted in a simple and organized manner using the Google Cloud Platform.

How to fix invalid polygons in BigQuery GIS?

A practical motivating example here is being able to fix invalid polygons that don’t conform to OGC spec and cannot be loaded to BigQuery. Unfortunately, such bad data can be found in various datasets one has to work with. You can often fix it using ST_MakeValid function in PostgreSQL, but there is no such facility yet in BigQuery GIS.

Why is my BigQuery query invalid?

This example is invalid because the value of the number falls outside the range of BIGNUMERIC: This example is invalid because the string contains invalid characters: When using CAST, a query can fail if BigQuery is unable to perform the cast.

How to fix BigQuery GIS “spikes”?

You can often fix it using ST_MakeValid function in PostgreSQL, but there is no such facility yet in BigQuery GIS. Let’s call this function from BigQuery! We’ll use SanFrancisco zoning dataset. Several zone polygons there have self intersecting loops and are not valid polygons. These “spikes” are probably artifacts of drawing tools. Let’s fix them.


2 Answers

Use prefix SAFE:

SELECT SAFE.st_geogfromtext(x), x
FROM (SELECT "bad" x)

null     bad
like image 66
Felipe Hoffa Avatar answered Oct 19 '22 12:10

Felipe Hoffa


There is now a way to fix this problem in most cases, by using make_valid parameter:

https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_geogfromtext_signature2

Here is a query that find the invalid polygons, and attempt to fix it:

SELECT 
  st_geogfromtext(string_field_1, make_valid => TRUE) as geog,
  SAFE.st_geogfromtext(string_field_1) IS NULL as geog_needed_fix
FROM t
like image 40
Michael Entin Avatar answered Oct 19 '22 12:10

Michael Entin