I have a geoJSON database with lots of polygons (census tracts specifically) and I have lots of long,lat points.
I am hoping that there would exist an efficient Python code to identify which census tract a given coordinate is in, however so far my googling hasn't revealed anything.
Thanks!
How to check if a point is inside a polygon in Python. To perform a Point in Polygon (PIP) query in Python, we can resort to the Shapely library's functions . within(), to check if a point is within a polygon, or . contains(), to check if a polygon contains a point.
Draw a horizontal line to the right of each point and extend it to infinity. Count the number of times the line intersects with polygon edges. A point is inside the polygon if either count of intersections is odd or point lies on an edge of polygon.
Select the Draw a polygon button, which similar to drawing a polyline, except that you need to complete the feature by making your final point at the same location as your initial point.
GeoJSON supports the following geometry types: Point , LineString , Polygon , MultiPoint , MultiLineString , and MultiPolygon . Geometric objects with additional properties are Feature objects.
A great option for working with these types of data is PostGIS, a spatial database extender for PostgreSQL. I personally keep all of my geo data in a PostGIS database, and then reference it in python using psycopg2
. I know it's not pure python, but it's got unbelievable performance benefits (discussed below) over pure python.
PostGIS has functionality built in to determine if a point or shape is within another shape. The good documentation on the ST_Within function expands upon this simple example:
SELECT ST_WITHIN({YOUR_POINT},boundary) FROM census; -- returns true or false for each of your tracts
The benifit you'll gain from PostGIS that you likely won't achieve elsewhere is indexing, which can improve your speed 1,000x [1], making it better than even the best written C program (unless the C program also creates an index for your data). The database, when properly setup, will cache information about your tracts, and when you ask if a point is within a tract, it won't have to search everything... it can take advantage of it's index.
Getting data into and out of PostGRES is pretty simple. A great tutorial that will walk you through the basics of PostGIS with sample datasets not too different from yours can be found here. It's reasonably long, but if you're new to PostGIS (as I was), you'll be very entertained and excited the entire time:
http://workshops.boundlessgeo.com/postgis-intro/
[1] Indexing decreased a nearest neighbor search in one of my huge databases (20 m from 53 seconds to 8.2 milliseconds.
I found an interesting article describing how to do exactly what you are looking to do.
TL;DR: Use Shapely
You will find this code at the end of the article:
import json from shapely.geometry import shape, Point # depending on your version, use: from shapely.geometry import shape, Point # load GeoJSON file containing sectors with open('sectors.json') as f: js = json.load(f) # construct point based on lon/lat returned by geocoder point = Point(-122.7924463, 45.4519896) # check each polygon to see if it contains the point for feature in js['features']: polygon = shape(feature['geometry']) if polygon.contains(point): print 'Found containing polygon:', feature
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With