
Speeding up checking of IP address membership in CIDR ranges, for large datasets

In a Postgres DB, I need to filter a set of several hundred thousand rows in a table A by including only those rows for which an IP address column (of type inet) in the row matches any of several thousand IP address blocks (of type cidr) in another table B. I've tried various indexes on the inet addresses in the first table and the cidr ranges in the second, but no matter what I do, the planner does a nested sequential scan, applying the << operator to every pair of IP addresses and prefixes.

Is there a way to speed this up with indexes or other clever tricks? (I can resort to external procedural scripting, but I was wondering if it's doable within Postgres.)

Thanks!

Asked by Christian, Sep 11 '13 23:09


3 Answers

This is an old question but prominent in Google results, so posting my 2 cents here:

With Postgres 9.4 and later you can use GiST indexes on inet and cidr columns: https://www.postgresql.org/docs/current/static/gist-builtin-opclasses.html

E.g. the following query will use the gist index (assuming a table from MaxMind's free dataset):

create index on geolite2_city_ipv4_block using gist (network inet_ops);

select * from geolite2_city_ipv4_block where network >>= '8.8.8.8';
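
Applied to the two-table case from the question, the same idea might look roughly like the sketch below. The table and column names (a, b, ip, block) are hypothetical, since the question does not give the real schema, and whether the planner actually picks the index depends on table sizes and statistics, so it is worth checking with EXPLAIN.

```sql
-- Hypothetical schema standing in for tables A and B from the question.
CREATE TABLE a (id bigint PRIMARY KEY, ip inet NOT NULL);
CREATE TABLE b (id bigint PRIMARY KEY, block cidr NOT NULL);

-- GiST index on the cidr blocks. inet_ops is named explicitly because
-- it is not the default operator class for inet/cidr.
CREATE INDEX b_block_gist ON b USING gist (block inet_ops);

-- Keep only the rows of a whose address falls inside at least one block.
-- EXISTS avoids duplicate rows when several blocks contain the same address.
EXPLAIN
SELECT a.*
FROM a
WHERE EXISTS (
    SELECT 1
    FROM b
    WHERE b.block >>= a.ip   -- block contains or equals the address
);
```

If the plan shows an index scan on b_block_gist inside the semi-join, each address is looked up through the index instead of being compared against every block.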
Answered by Magnus Hiie, Sep 28 '22 13:09


Have you looked at ip4r? http://pgfoundry.org/projects/ip4r. IIRC, it is really fast for INET-related lookups.

Answered by bma, Sep 28 '22 14:09


Case closed. To make things fast, do the following:

  • Use the ip4r types available from http://pgfoundry.org/projects/ip4r, as pointed out by user bma. These types support index-backed containment lookups, which Postgres's native inet and cidr types do not (up to Postgres 9.3).

  • Do not use the ip4r type directly, but expand it into lower and upper values as suggested by user caskey and mentioned in the ip4r docs: https://github.com/petere/ip4r-cvs/blob/master/README.ip4r#L187

Given the above, if you're using type ip4 (assuming you're dealing with v4 addresses) for all compared addresses, then the planner will leverage indexes on those columns.
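
A minimal sketch of that setup, assuming the ip4r extension is installed and IPv4-only data. The table and column names (a4, b4, lo, hi) and the sample values are made up for illustration and are not the schema actually used; lower() and upper() are the ip4r functions that return the two ends of a range as ip4 values.

```sql
-- Assumes ip4r is available as an extension on this server.
CREATE EXTENSION IF NOT EXISTS ip4r;

-- Hypothetical tables: addresses as ip4, blocks stored as explicit bounds.
CREATE TABLE a4 (id bigint PRIMARY KEY, ip ip4 NOT NULL);
CREATE TABLE b4 (id bigint PRIMARY KEY, lo ip4 NOT NULL, hi ip4 NOT NULL);

-- Expand each block into its lower and upper address.
INSERT INTO b4 (id, lo, hi) VALUES
    (1, lower('10.0.0.0/8'::ip4r),     upper('10.0.0.0/8'::ip4r)),
    (2, lower('192.168.0.0/16'::ip4r), upper('192.168.0.0/16'::ip4r));

INSERT INTO a4 (id, ip) VALUES
    (1, '10.1.2.3'),
    (2, '8.8.8.8');

-- Plain btree indexes, usable for ordinary <= and >= comparisons on ip4.
CREATE INDEX b4_lo_hi_idx ON b4 (lo, hi);
CREATE INDEX a4_ip_idx ON a4 (ip);

-- Membership test written as a range comparison instead of a containment operator.
SELECT a4.*
FROM a4
WHERE EXISTS (
    SELECT 1
    FROM b4
    WHERE a4.ip BETWEEN b4.lo AND b4.hi
);
```

With only a handful of rows the planner will likely still choose sequential scans, so the effect on the plan is only visible with realistically sized tables and fresh statistics (ANALYZE).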

Thanks for the help, guys!

Answered by Christian, Sep 28 '22 13:09